Method for distinguishing immunologically defined ALL subtypes

The present invention is directed to a method for distinguishing immunologically
defined ALL subtypes by determining the expression level of selected marker
genes.

Leukemias are classified into four different groups or types: acute myeloid (AML),
acute lymphatic (ALL), chronic myeloid (CML) and chronic lymphatic leukemia
(CLL). Within these groups, several subcategories can be identified further using a
panel of standard techniques as described below. These different subcategories of
leukemia are associated with varying clinical outcomes and are therefore the basis
for different treatment strategies. The importance of highly specific classification
may be illustrated in more detail for AML, a very heterogeneous group of
diseases. Effort is aimed at identifying biological entities and at distinguishing and
classifying subgroups of AML which are associated with a favorable, intermediate or
unfavorable prognosis, respectively. In 1976, the French-American-British (FAB)
co-operative group proposed the FAB classification, which is based on
cytomorphology and cytochemistry, in order to separate AML subgroups according
to the morphological appearance of blasts in the blood and bone marrow. In
addition, it was recognized that genetic abnormalities occurring in the leukemic
blast had a major impact on the morphological picture and even more on the
prognosis. So far, the karyotype of the leukemic blasts is the most important
independent prognostic factor regarding response to therapy as well as survival.

Usually, a combination of methods is necessary to obtain the most important
information in leukemia diagnostics: analysis of the morphology and
cytochemistry of bone marrow blasts and peripheral blood cells is necessary to
establish the diagnosis. In some cases the addition of immunophenotyping is
mandatory to separate very undifferentiated AML from acute lymphoblastic
leukemia and CLL. The leukemia subtypes investigated can be diagnosed by
cytomorphology alone only if an expert reviews the smears. However, a genetic
analysis based on chromosome analysis, fluorescence in situ hybridization or
RT-PCR, together with immunophenotyping, is required in order to assign all cases
to the right category. Besides diagnosis, the aim of these techniques is mainly to
determine the prognosis of the leukemia. A major disadvantage of these methods,
however, is that viable cells are necessary, as the cells for genetic analysis have to
divide in vitro in order to obtain metaphases for the analysis. Another problem is
the long time of 72 hours from receipt of the material in the laboratory until the
result is obtained. Furthermore, great experience in the preparation of chromosomes,
and even more in analyzing the karyotypes, is required to obtain the correct result
in at least 90% of cases. Using these techniques in combination, hematological
malignancies are, in a first approach, separated into chronic myeloid leukemia
(CML), chronic lymphatic (CLL), acute lymphoblastic (ALL), and acute myeloid
leukemia (AML). Within the latter three disease entities several prognostically
relevant subtypes have been established. In a second approach, this further
sub-classification is based mainly on genetic abnormalities of the leukemic blasts
and is clearly associated with different prognoses.

The sub-classification of leukemias is becoming increasingly important to guide
therapy. The development of new, specific drugs and treatment approaches requires
the identification of specific subtypes that may benefit from a distinct therapeutic
protocol, which can thus improve the outcome of distinct subsets of leukemia. For
example, the new therapeutic drug STI571 (Imatinib) inhibits the CML-specific
chimeric tyrosine kinase BCR-ABL generated from the genetic defect observed in
CML, the BCR-ABL rearrangement due to the translocation between
chromosomes 9 and 22 (t(9;22)(q34;q11)). In patients treated with this new drug,
the therapy response is dramatically higher as compared to all other drugs that had
been used so far. Another example is the subtype of acute myeloid leukemia AML
M3 and its variant M3v, both with karyotype t(15;17)(q22;q11-12). The
introduction of a new drug (all-trans retinoic acid, ATRA) has improved the
outcome in this subgroup of patients from about 50% to 85% long-term survivors.
As it is mandatory for patients suffering from these specific leukemia
subtypes to be identified as fast as possible so that the best therapy can be applied,
diagnostics today must accomplish sub-classification with maximal precision. Not
only for these subtypes but also for several other leukemia subtypes, different
treatment approaches could improve outcome. Therefore, rapid and precise
identification of distinct leukemia subtypes is the future goal for diagnostics.

Thus, the technical problem underlying the present invention was to provide means
for leukemia diagnostics which overcome at least some of the disadvantages of the
prior art diagnostic methods, in particular the time-consuming and
unreliable combination of different methods, and which provide a rapid assay to
unambiguously distinguish one subtype from another, e.g. by genetic analysis.

According to Golub et al. (Science, 1999, 286, 531-7), gene expression profiles can
be used for class prediction and for discriminating AML from ALL samples. However,
for the analysis of acute leukemias the selection of the two different subgroups was
performed using exclusively morphologic-phenotypical criteria. This was only
descriptive and does not provide deeper insights into the pathogenesis or the
underlying biology of the leukemia. The approach reproduces only very basic
knowledge of cytomorphology and is intended to differentiate classes. The data are
not sufficient to predict prognostically relevant cytogenetic aberrations.

Furthermore, the international application WO-A 03/039443 discloses marker
genes the expression levels of which are characteristic for certain leukemia, e.g.
AML subtypes, and additionally discloses methods for differentiating between the
subtypes of AML cells by determining the expression profile of the disclosed
marker genes. However, WO-A 03/039443 does not provide guidance as to which set
of distinct genes discriminates between two subtypes and, as such, can be routinely
used in order to distinguish one ALL subtype from another.

The problem is solved by the present invention, which provides a method for
distinguishing immunologically defined ALL subtypes Pro-B-ALL, c-ALL,
Pre-B-ALL, c-ALL/Pre-B-ALL, mature B-ALL, precursor B-ALL, Pro-T-ALL,
Pre-T-ALL, cortical T-ALL, mature T-ALL, and/or T-ALL in a sample, the method
comprising determining the expression level of markers selected from the markers
identifiable by their Affymetrix Identification Numbers (affy id) as defined in
Tables 1 and/or 2,

wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 1.1



WO 2005/045437 PCT/EP2004/0 12459 



is indicative for the presence of ball when ball is distinguished from all 
other subtypes, 

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 1.2

is indicative for the presence of cpre when cpre is distinguished from 
all other subtypes, 

and/or wherein

a lower expression of at least one polynucleotide defined by any of the 
numbers 1, 2, 3, 6, 8, 9, 10, 12, 13, 14, 16, 17, 18, 22, 23, 24, 25, 30, 31, 
34, 38, 40, 42, 43, 44, 46, 48, and/or 49, of Table 1.3 and/or 

a higher expression of at least one polynucleotide defined by any of the
numbers 4, 5, 7, 11, 15, 19, 20, 21, 26, 27, 28, 29, 32, 33, 35, 36, 37, 39,
41, 45, 47, and/or 50 of Table 1.3

is indicative for the presence of cpreph when cpreph is distinguished from
all other subtypes,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 17, 18, 19, 20, 21,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41, 42,
43, 44, 45, 46, 47, and/or 48 of Table 1.4, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 16, 22, 39, 49, and/or 50 of Table 1.4

is indicative for the presence of kort when kort is distinguished from all 
other subtypes, 

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
40, 41, 42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 1.5

is indicative for the presence of pret when pret is distinguished from all
other subtypes,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19, 21, 25, 26,
27, 28, 29, 32, 35, 36, 37, 38, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
and/or 50 of Table 1.6, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 5, 7, 10, 20, 22, 23, 24, 30, 31, 33, 34, and/or 39 of Table 1.6,

is indicative for the presence of prob when prob is distinguished from
all other subtypes,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 3, 4, 8, 10, 12, 15, 17, 20, 23, 24, 25, 27, 28, 29, 30, 31, 34,
36, 37, 40, 42, 44, 45, 46, 49, and/or 50 of Table 2.1, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 2, 5, 6, 7, 9, 11, 13, 14, 16, 18, 19, 21, 22, 26, 32, 33, 35, 38,
39, 41, 43, 47, and/or 48 of Table 2.1,

is indicative for the presence of ball when ball is distinguished from
cpre,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 38, 39, 40, 41,
42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 2.2, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 26 and/or 37 of Table 2.2

is indicative for the presence of ball when ball is distinguished from
cpreph,

and/or wherein



a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
22, 23, 24, 25, 26, 28, 30, 31, 33, 34, 36, 37, 38, 39, 40, 41, 42, 43, 45,
46, 47, 48, and/or 49 of Table 2.3, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 6, 7, 27, 29, 32, 35, 44, and/or 50 of Table 2.3

is indicative for the presence of ball when ball is distinguished from
kort,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 3, 5, 6, 7, 13, 17, 18, 19, 21, 22, 26, 27, 30, 32, 34, 36, 38, 40,
47, and/or 48 of Table 2.4, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 1, 2, 4, 8, 9, 10, 11, 12, 14, 15, 16, 20, 23, 24, 25, 28, 29,
31, 33, 35, 37, 39, 41, 42, 43, 44, 45, 46, 49, and/or 50 of Table 2.4

is indicative for the presence of ball when ball is distinguished from
pret,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41,
42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 2.5, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 29, 30, and/or 39 of Table 2.5,

is indicative for the presence of ball when ball is distinguished from
prob,

and/or wherein

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 7, 9, 10, 11, 13, 17, 18, 21, 24, 25, 27, 29, 30, 31,
36, 37, 38, 40, 42, 43, 45, 46, 49, and/or 50 of Table 2.6, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 6, 8, 12, 14, 15, 16, 19, 20, 22, 23, 26, 28, 32, 33, 34, 35, 39,
41, 44, 47, and/or 48 of Table 2.6,

is indicative for the presence of cpre when cpre is distinguished from
cpreph,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 4, 5, 6, 8, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 23,
24, 25, 27, 28, 29, 30, 31, 32, 35, 36, 38, 40, 41, 43, 44, 45, 46, 48, 49,
and/or 50 of Table 2.7, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 3, 7, 9, 11, 22, 26, 33, 34, 37, 39, 42, and/or 47 of Table 2.7,

is indicative for cpre when cpre is distinguished from kort,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 20, 28, 31, 37, 38, and/or 50 of Table 2.8, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 21, 22, 23, 24, 25, 26, 27, 29, 30, 32, 33, 34, 35, 36, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, and/or 49 of Table 2.8

is indicative for cpre when cpre is distinguished from pret,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 27, 28, 29, 30, 31, 32, 34, 35, 36, 37, 38, 39, 40,
42, 43, 44, 45, 46, 47, 48, and/or 50 of Table 2.9, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 26, 33, 41, and/or 49 of Table 2.9

is indicative for cpre when cpre is distinguished from prob,

and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 3, 6, 12, 17, 23, 28, 34, 35, and/or 41 of Table 2.10, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 1, 2, 4, 5, 7, 8, 9, 10, 11, 13, 14, 15, 16, 18, 19, 20, 21, 22, 24,
25, 26, 27, 29, 30, 31, 32, 33, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 47,
48, 49, and/or 50 of Table 2.10

is indicative for cpreph when cpreph is distinguished from kort, 
and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 42 and/or 43 of Table 2.11, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
39, 40, 41, 44, 45, 46, 47, 48, 49, and/or 50 of Table 2.11,

is indicative for cpreph when cpreph is distinguished from pret, 
and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 3, 5, 8, 9, 11, 12, 13, 15, 18, 21, 24, 27, 28, 29, 32, 34, 36,
38, 41, 42, 43, 46, 47, and/or 48 of Table 2.12, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 2, 4, 6, 7, 10, 14, 16, 17, 19, 20, 22, 23, 25, 26, 30, 31, 33, 35,
37, 39, 40, 44, 45, 49, and/or 50 of Table 2.12

is indicative for cpreph when cpreph is distinguished from prob,
and/or wherein

a lower expression of at least one polynucleotide defined by any of the
numbers 19 and/or 40 of Table 2.13, and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
41, 42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 2.13,

is indicative for kort when kort is distinguished from pret, 
and/or wherein 

a lower expression of at least one polynucleotide defined by any of the
numbers 1, 4, 7, 9, 10, 11, 13, 14, 15, 16, 17, 20, 21, 22, 28, 29, 31, 32,
33, 35, 36, 37, 40, 41, 42, 43, 45, 47, 48, and/or 50 of Table 2.14,
and/or

a higher expression of at least one polynucleotide defined by any of the
numbers 2, 3, 5, 6, 8, 12, 18, 19, 23, 24, 25, 26, 27, 30, 34, 38, 39, 44,
46, and/or 49 of Table 2.14

is indicative for kort when kort is distinguished from prob,
and/or wherein

a lower expression of at least one polynucleotide defined by any of the 
numbers 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 
39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, and/or 50 of Table 2.15, 

is indicative for pret when pret is distinguished from prob.



As used herein, the following abbreviations represent the classified 
immunologically defined ALL subtypes: 

ball=Mature B-ALL 

cpre=c-ALL/Pre-B-ALL without t(9;22)

cpreph=c-ALL/Pre-B-ALL with t(9;22)

kort=Cortical T-ALL 

pret=Pre-T-ALL

prob=Pro-B-ALL 
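
The relationship between the six abbreviations and the numbering of the Tables can be illustrated with a short sketch. It assumes, based only on the order in which the comparisons are recited above, that Tables 1.1 to 1.6 list the one-versus-all comparisons in alphabetical order of the abbreviations and that Tables 2.1 to 2.15 list the pairwise comparisons in the same order. The function names are hypothetical and serve illustration only.

```python
from itertools import combinations

# Abbreviations of the immunologically defined ALL subtypes as used in the text.
SUBTYPES = {
    "ball": "Mature B-ALL",
    "cpre": "c-ALL/Pre-B-ALL without t(9;22)",
    "cpreph": "c-ALL/Pre-B-ALL with t(9;22)",
    "kort": "Cortical T-ALL",
    "pret": "Pre-T-ALL",
    "prob": "Pro-B-ALL",
}


def one_vs_all_tables(subtypes):
    """Map Table 1.x to the subtype that is compared against all others.

    Assumption: the tables follow the alphabetical order of the abbreviations.
    """
    codes = sorted(subtypes)
    return {f"1.{i}": code for i, code in enumerate(codes, start=1)}


def pairwise_tables(subtypes):
    """Map Table 2.x to the pair (first subtype, second subtype).

    Six subtypes yield 6 * 5 / 2 = 15 unordered pairs.
    """
    codes = sorted(subtypes)
    pairs = combinations(codes, 2)
    return {f"2.{i}": pair for i, pair in enumerate(pairs, start=1)}


if __name__ == "__main__":
    for table, (first, second) in pairwise_tables(SUBTYPES).items():
        print(f"Table {table}: {first} vs. {second}")
```

Six subtypes give six one-versus-all comparisons and fifteen pairwise comparisons, which matches the ranges Tables 1.1-1.6 and 2.1-2.15.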


According to the present invention, a "sample" means any biological material
containing genetic information in the form of nucleic acids or proteins obtainable
or obtained from an individual. The sample includes e.g. tissue samples, cell
samples, bone marrow and/or body fluids such as blood, saliva or semen. Preferably,
the sample is blood or bone marrow, more preferably the sample is bone marrow.

The person skilled in the art is aware of methods for isolating nucleic acids and
proteins from a sample. A general method for isolating and preparing nucleic acids
from a sample is outlined in Example 3.



According to the present invention, the term "lower expression" is generally
assigned to all polynucleotides definable by numbers and Affymetrix Id. whose
t-values and fold change (fc) values are negative, as indicated in the Tables.
Accordingly, the term "higher expression" is generally assigned to all
polynucleotides definable by numbers and Affymetrix Id. whose t-values and fold
change (fc) values are positive.
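
The sign convention above can be expressed as a minimal sketch. It assumes that each table row supplies a signed t-value and a signed fold change for the first-named subtype; the function name and the handling of rows with disagreeing signs are hypothetical and not taken from the Tables.

```python
def expression_direction(t_value, fold_change):
    """Classify a marker as "lower" or "higher" expressed from signed statistics.

    Both values negative -> "lower"; both positive -> "higher".
    Zero values or disagreeing signs are reported as "undetermined"
    (an assumption; the text does not address such rows).
    """
    if t_value < 0 and fold_change < 0:
        return "lower"
    if t_value > 0 and fold_change > 0:
        return "higher"
    return "undetermined"


if __name__ == "__main__":
    # Illustrative values only, not taken from the Tables.
    print(expression_direction(-12.4, -3.1))
    print(expression_direction(8.7, 2.5))
```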


According to the present invention, the term "expression" refers to the process by
which mRNA or a polypeptide is produced based on the nucleic acid sequence of a
gene, i.e. "expression" also includes the formation of mRNA upon transcription. In
accordance with the present invention, the term "determining the expression level"
preferably refers to the determination of the level of expression, namely of the
markers.

Generally, "marker" refers to any genetically controlled difference which can be
used in the genetic analysis of a test versus a control sample, for the purpose of
assigning the sample to a defined genotype or phenotype. As used herein,
"markers" refer to genes which are differentially expressed in, e.g., different AML
subtypes. The markers can be defined by their gene symbol name, their encoded
protein name, their transcript identification number (cluster identification number),
the data base accession number, public accession number or GenBank identifier or,
as done in the present invention, Affymetrix identification number, chromosomal
location, UniGene accession number and cluster type, LocusLink accession number
(see Examples and Tables).

The Affymetrix identification number (affy id) is accessible to anyone, including
the person skilled in the art, via the "Gene Expression Omnibus" internet page of
the National Center for Biotechnology Information (NCBI)
(http://www.ncbi.nlm.nih.gov/geo/). In particular, the affy ids of the
polynucleotides used for the method of the present invention are derived from the
so-called U133 chip. The sequence data of each identification number can be
viewed at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GPL96

Generally, the expression level of a marker is determined by determining the
expression of its corresponding "polynucleotide" as described hereinafter.




According to the present invention, the term "polynucleotide" refers, generally, to a
DNA, in particular cDNA, or RNA, in particular a cRNA, or a portion thereof or a
polypeptide or a portion thereof. In the case of RNA (or cDNA), the polynucleotide
is formed upon transcription of a nucleotide sequence which is capable of
expression. The polynucleotide fragments refer to fragments preferably of between
at least 8, such as 10, 12, 15 or 18 nucleotides and at least 50, such as 60, 80, 100,
200 or 300 nucleotides in length, or a complementary sequence thereto,
representing a consecutive stretch of nucleotides of a gene, cDNA or mRNA. In
other words, polynucleotides also include any fragment (or complementary
sequence thereto) of a sequence derived from any of the markers defined above as
long as these fragments unambiguously identify the marker.

The determination of the expression level may be effected at the transcriptional or
translational level, i.e. at the level of mRNA or at the protein level. Protein
fragments such as peptides or polypeptides advantageously comprise between at
least 6 and at least 25, such as 30, 40, 80, 100 or 200 consecutive amino acids
representative of the corresponding full length protein. Six amino acids are
generally recognized as the lowest peptidic stretch giving rise to a linear epitope
recognized by an antibody, fragment or derivative thereof. Alternatively, the
proteins or fragments thereof may be analysed using nucleic acid molecules
specifically binding to three-dimensional structures (aptamers).

Depending on the nature of the polynucleotide or polypeptide, the determination of
the expression levels may be effected by a variety of methods. For determining and
detecting the expression level, it is preferred in the present invention that the
polynucleotide, in particular the cRNA, is labelled.

The labelling of the polynucleotide or a polypeptide can occur by a variety of
methods known to the skilled artisan. The label can be fluorescent,
chemiluminescent, bioluminescent or radioactive (such as 3H or 32P). The labelling
compound can be any labelling compound suitable for the labelling of
polynucleotides and/or polypeptides. Examples include fluorescent dyes, such as
fluorescein, dichlorofluorescein, hexachlorofluorescein, BODIPY variants, ROX,
tetramethylrhodamine, rhodamine X, Cyanine-2, Cyanine-3, Cyanine-5, Cyanine-7,
IRD40, FluorX, Oregon Green, Alexa variants (available e.g. from Molecular
Probes or Amersham Biosciences) and the like, biotin or biotinylated nucleotides,
digoxigenin, radioisotopes, antibodies, enzymes and receptors. Depending on the
type of labelling, the detection is done via fluorescence measurements, conjugation
to streptavidin and/or avidin, antigen-antibody and/or antibody-antibody
interactions, radioactivity measurements, as well as catalytic and/or receptor/ligand
interactions. Suitable methods include the direct labelling (incorporation) method,
the amino-modified (amino-allyl) nucleotide method (available e.g. from Ambion),
and the primer tagging method (DNA dendrimer labelling, available as a kit e.g.
from Genisphere). Particularly preferred for the present invention is the use of
biotin or biotinylated nucleotides for labelling, with the latter being directly
incorporated into, e.g., the cRNA polynucleotide by in vitro transcription.

If the polynucleotide is mRNA, cDNA may be prepared into which a detectable
label, as exemplified above, is incorporated. Said detectably labelled cDNA, in
single-stranded form, may then be hybridised, preferably under stringent or highly
stringent conditions, to a panel of single-stranded oligonucleotides representing
different genes and affixed to a solid support such as a chip. Upon applying
appropriate washing steps, those cDNAs that have a counterpart in the
oligonucleotide panel will be detected or quantitatively detected. Various
advantageous embodiments of this general method are feasible. For example, the
mRNA or the cDNA may be amplified, e.g. by polymerase chain reaction, wherein it
is preferable, for quantitative assessments, that the number of amplified copies
corresponds, relative to further amplified mRNAs or cDNAs, to the number of
mRNAs originally present in the cell. In a preferred embodiment of the present
invention, the cDNAs are transcribed into cRNAs prior to the hybridisation step,
wherein a label is incorporated into the nucleic acid only in the transcription step
and wherein the cRNA is employed for hybridisation. Alternatively, the label may
be attached subsequent to the transcription step.

Similarly, proteins from a cell or tissue under investigation may be contacted with
a panel of aptamers or of antibodies or fragments or derivatives thereof. The
antibodies etc. may be affixed to a solid support such as a chip. Binding of proteins
indicative of an AML subtype may be verified by binding to a detectably labelled
secondary antibody or aptamer. For the labelling of antibodies, reference is made to
Harlow and Lane, "Antibodies, a laboratory manual", CSH Press, 1988, Cold
Spring Harbor. Specifically, a minimum set of proteins necessary for diagnosis of
all AML subtypes may be selected for creation of a protein array system to make
a diagnosis directly on a protein lysate of a diagnostic bone marrow sample. Protein
array systems for the detection of specific protein expression profiles are already
available (for example: Bio-Plex, BIORAD, Munich, Germany). For this
application, antibodies against the proteins preferably have to be produced and
immobilized on a platform, e.g. glass slides or microtiter plates. The immobilized
antibodies can be labelled with a reactant specific for the particular target proteins
as discussed above. The reactants can include enzyme substrates, DNA, receptors,
antigens or antibodies to create, for example, a capture sandwich immunoassay.

For reliably distinguishing ALL subtypes it is useful that the expression of more
than one of the above defined markers is determined. As a criterion for the choice
of markers, the statistical significance of markers, as expressed in q or p values
based on the concept of the false discovery rate, is determined. In doing so, a
measure of statistical significance called the q value is associated with each tested
feature. The q value is similar to the p value, except that it is a measure of
significance in terms of the false discovery rate rather than the false positive rate
(Storey JD and Tibshirani R. Proc.Natl.Acad.Sci., 2003, Vol. 100:9440-5).

In a preferred embodiment of the present invention, markers as defined in Tables
1.1-2.15 having a q-value of less than 3E-06, more preferred less than 1.5E-09,
most preferred less than 1.5E-11, are measured.
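
How markers might be filtered by such thresholds can be sketched as follows. The sketch does not implement the q-value estimator of Storey and Tibshirani; as a simpler stand-in it uses the Benjamini-Hochberg adjustment, which is a more conservative estimate that corresponds to assuming that all tested features are truly null. The function names and the example values are hypothetical.

```python
def bh_adjusted_p_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Used here as a conservative stand-in for q-values (assumption).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_markers(q_by_marker, threshold=3e-06):
    """Return the marker names whose q-value is below the threshold.

    The default corresponds to the least strict threshold named in the text;
    1.5e-09 and 1.5e-11 are the stricter ones.
    """
    return [name for name, q in q_by_marker.items() if q < threshold]


if __name__ == "__main__":
    # Illustrative marker names and q-values only.
    example = {"marker_a": 1e-12, "marker_b": 2e-10, "marker_c": 1e-05}
    print(select_markers(example))
    print(select_markers(example, threshold=1.5e-11))
```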

Of the above defined markers, the expression level of at least two, preferably of at
least ten, more preferably of at least 25, most preferably of 50 of at least one of the
Tables of the markers is determined.

In another preferred embodiment, the expression level of at least 2, of at least 5, of
at least 10 out of the markers having the numbers 1-10, 1-20, 1-40, 1-50 of at
least one of the Tables 1.1-2.15 are measured.

The level of the expression of the "marker", i.e. the expression of the
polynucleotide, is indicative of the ALL subtype of a cell or an organism. The level
of expression of a marker or group of markers is measured and is compared with
the level of expression of the same marker or the same group of markers from other
cells or samples. The comparison may be effected in an actual experiment or in
silico. When the expression level, also referred to as expression pattern or
expression signature (expression profile), is measurably different, there is according
to the invention a meaningful difference in the level of expression. Preferably the
difference is at least 5%, 10% or 20%, more preferred at least 50% or may even be
as high as 75% or 100%. More preferred the difference in the level of expression is
at least 200%, i.e. two fold, at least 500%, i.e. five fold, or at least 1000%, i.e. 10
fold.

Accordingly, the expression level of markers expressed lower in a first subtype
than in at least one second subtype, which differs from the first subtype, is at least
5%, 10% or 20%, more preferred at least 50% or may even be 75% or 100%, i.e.
2-fold lower, preferably at least 10-fold, more preferably at least 50-fold, and most
preferably at least 100-fold lower in the first subtype. On the other hand, the
expression level of markers expressed higher in a first subtype than in at least one
second subtype, which differs from the first subtype, is at least 5%, 10% or 20%,
more preferred at least 50% or may even be 75% or 100%, i.e. 2-fold higher,
preferably at least 10-fold, more preferably at least 50-fold, and most preferably at
least 100-fold higher in the first subtype.
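
The relation between a percentage difference and a fold difference can be made concrete with a small sketch. It assumes one possible convention, namely that the difference is expressed relative to the smaller of the two expression levels, so that a difference of 100% corresponds to a 2-fold lower or 2-fold higher level. The function names and the default threshold of 5% are chosen for illustration only.

```python
def fold_difference(level_a, level_b):
    """Symmetric fold difference between two positive expression levels.

    Assumption: the larger level is divided by the smaller one, so the
    result is always >= 1 regardless of the direction of the change.
    """
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    return max(level_a, level_b) / min(level_a, level_b)


def percent_difference(level_a, level_b):
    """Difference in percent relative to the smaller level (2-fold = 100%)."""
    return (fold_difference(level_a, level_b) - 1.0) * 100.0


def is_meaningful(level_a, level_b, min_percent=5.0):
    """True if the two levels differ by at least min_percent."""
    return percent_difference(level_a, level_b) >= min_percent


if __name__ == "__main__":
    # Illustrative signal intensities only.
    print(fold_difference(50.0, 100.0), percent_difference(50.0, 100.0))
    print(is_meaningful(104.0, 100.0), is_meaningful(120.0, 100.0))
```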

In another embodiment of the present invention, the sample is derived from an
individual having leukaemia, preferably ALL.

For the method of the present invention it is preferred that the polynucleotide the
expression level of which is determined is in the form of a transcribed polynucleotide.
A particularly preferred transcribed polynucleotide is an mRNA, a cDNA and/or a
cRNA, with the latter being preferred. Transcribed polynucleotides are isolated
from a sample, reverse transcribed and/or amplified, and labelled, by employing
methods well known to the person skilled in the art (see Example 3). In a preferred
embodiment of the methods according to the invention, the step of determining the
expression profile further comprises amplifying the transcribed polynucleotide.

In order to determine the expression level of the transcribed polynucleotide by the
method of the present invention, it is preferred that the method comprises
hybridizing the transcribed polynucleotide to a complementary polynucleotide, or a
portion thereof, under stringent hybridization conditions, as described hereinafter.
The term "hybridizing" means hybridization under conventional hybridization
conditions, preferably under stringent conditions as described, for example, in
Sambrook, J., et al., in "Molecular Cloning: A Laboratory Manual" (1989), Eds. J.
Sambrook, E. F. Fritsch and T. Maniatis, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY and the further definitions provided above. Such
conditions are, for example, hybridization in 6x SSC, pH 7.0 / 0.1% SDS at about
45°C for 18-23 hours, followed by a washing step with 2x SSC/0.1% SDS at 50°C.
In order to select the stringency, the salt concentration in the washing step can for
example be chosen between 2x SSC/0.1% SDS at room temperature for low
stringency and 0.2x SSC/0.1% SDS at 50°C for high stringency. In addition, the
temperature of the washing step can be varied between room temperature, ca. 22°C,
for low stringency, and 65°C to 70°C for high stringency. Also contemplated are
polynucleotides that hybridize at lower stringency hybridization conditions.
Changes in the stringency of hybridization and signal detection are primarily
accomplished through the manipulation, preferably of formamide concentration
(lower percentages of formamide result in lowered stringency), salt conditions, or
temperature. For example, lower stringency conditions include an overnight
incubation at 37°C in a solution comprising 6x SSPE (20x SSPE = 3M NaCl;
0.2M NaH2PO4; 0.02M EDTA, pH 7.4), 0.5% SDS, 30% formamide, 100 µg/ml
salmon sperm blocking DNA, followed by washes at 50°C with 1x SSPE, 0.1%
SDS. In addition, to achieve even lower stringency, washes performed following
stringent hybridization can be done at higher salt concentrations (e.g. 5x SSC).
Variations in the above conditions may be accomplished through the inclusion
and/or substitution of alternate blocking reagents used to suppress background in
hybridization experiments. The inclusion of specific blocking reagents may require
modification of the hybridization conditions described above, due to problems with
compatibility.

"Complementary" and "complementarity", respectively, can be described by the
percentage, i.e. proportion, of nucleotides which can form base pairs between two
polynucleotide strands or within a specific region or domain of the two strands.
Generally, complementary nucleotides are, according to the base pairing rules,
adenine and thymine (or adenine and uracil), and cytosine and guanine.
Complementarity may be partial, in which only some of the nucleic acids' bases are
matched according to the base pairing rules. Or, there may be a complete or total
complementarity between the nucleic acids. The degree of complementarity
between nucleic acid strands has effects on the efficiency and strength of
hybridization between nucleic acid strands.

WO 2005/045437

PCT/EP2004/012459

Two nucleic acid strands are considered to be 100% complementary to each other
over a defined length if in a defined region all adenines of a first strand can pair
with a thymine (or a uracil) of a second strand, all guanines of a first strand can
pair with a cytosine of a second strand, all thymines (or uracils) of a first strand can
pair with an adenine of a second strand, and all cytosines of a first strand can pair
with a guanine of a second strand, and vice versa. According to the present
invention, the degree of complementarity is determined over a stretch of 20,
preferably 25, nucleotides, i.e. a 60% complementarity means that within a region
of 20 nucleotides of two nucleic acid strands 12 nucleotides of the first strand can
base pair with 12 nucleotides of the second strand according to the above rules,
either as a stretch of 12 contiguous nucleotides or interspersed by non-pairing
nucleotides, when the two strands are attached to each other over said region of 20
nucleotides. The degree of complementarity can range from at least about 50% to
full, i.e. 100% complementarity. Two single nucleic acid strands are said to be
"substantially complementary" when they are at least about 80% complementary,
preferably about 90% or higher. For carrying out the method of the present
invention substantial complementarity is preferred.
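The percentage definition above can be illustrated with a minimal Python sketch. It assumes that the two strands have already been aligned over the same region and that the second strand is written so that position i faces position i of the first strand; the function names and the default threshold argument are illustrative and not part of the specification.

```python
# Watson-Crick partners; U is accepted alongside T so RNA strands can be compared.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(first: str, second: str) -> float:
    """Percentage of facing positions that can form a base pair.

    Assumption: both strands cover the same region (e.g. 20 or 25
    nucleotides) and are given position by position as they face each other.
    """
    if not first or len(first) != len(second):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum((a, b) in PAIRS for a, b in zip(first.upper(), second.upper()))
    return 100.0 * paired / len(first)


def is_substantially_complementary(first: str, second: str,
                                   threshold: float = 80.0) -> bool:
    """True if the strands reach the 'at least about 80%' level named in the text."""
    return percent_complementarity(first, second) >= threshold


# 12 pairing positions within a region of 20 nucleotides.
example_first = "A" * 20
example_second = "T" * 12 + "A" * 8
example_percent = percent_complementarity(example_first, example_second)
```

With the example strands, 12 of 20 positions pair, which corresponds to the 60% case described in the text and falls below the threshold for substantial complementarity.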

Preferred methods for detection and quantification of the amount of
polynucleotides, i.e. for the methods according to the invention allowing the
determination of the level of expression of a marker, are those described by
Sambrook et al. (1989) or real time methods known in the art such as the TaqMan®
method disclosed in WO92/02638 and the corresponding U.S. 5,210,015, U.S.
5,804,375, U.S. 5,487,972. This method exploits the exonuclease activity of a
polymerase to generate a signal. In detail, the (at least one) target nucleic acid
component is detected by a process comprising contacting the sample with an
oligonucleotide containing a sequence complementary to a region of the target
nucleic acid component and a labeled oligonucleotide containing a sequence
complementary to a second region of the same target nucleic acid component
sequence strand, but not including the nucleic acid sequence defined by the first
oligonucleotide, to create a mixture of duplexes during hybridization conditions,
wherein the duplexes comprise the target nucleic acid annealed to the first
oligonucleotide and to the labeled oligonucleotide such that the 3'-end of the first
oligonucleotide is adjacent to the 5'-end of the labeled oligonucleotide. Then this
mixture is treated with a template-dependent nucleic acid polymerase having a 5'
to 3' nuclease activity under conditions sufficient to permit the 5' to 3' nuclease
activity of the polymerase to cleave the annealed, labeled oligonucleotide and
release labeled fragments. The signal generated by the hydrolysis of the labeled
oligonucleotide is detected and/or measured. TaqMan® technology eliminates the
need for a solid phase bound reaction complex to be formed and made detectable.
Other methods include e.g. fluorescence resonance energy transfer between two
adjacently hybridized probes as used in the LightCycler® format described in U.S.
6,174,670.

A preferred protocol if the marker, i.e. the polynucleotide, is in form of a
transcribed polynucleotide, is described in Example 3, where total RNA is isolated,
cDNA and, subsequently, cRNA is synthesized and biotin is incorporated during
the transcription reaction. The purified cRNA is applied to commercially available
arrays which can be obtained e.g. from Affymetrix. The hybridized cRNA is
detected according to the methods described in Example 3. The arrays are produced
by photolithography or other methods known to experts skilled in the art e.g. from
U.S. 5,445,934, U.S. 5,744,305, U.S. 5,700,637, U.S. 5,945,334 and EP 0 619 321
or EP 0 373 203, or as described hereinafter in greater detail.

In another embodiment of the present invention, the polynucleotide or at least one
of the polynucleotides is in form of a polypeptide. In another preferred
embodiment, the expression level of the polynucleotides or polypeptides is detected
using a compound which specifically binds to the polynucleotide or the polypeptide
of the present invention.

As used herein, "specifically binding" means that the compound is capable of 
discriminating between two or more polynucleotides or polypeptides, i.e. it binds to 
the desired polynucleotide or polypeptide, but essentially does not bind 
unspecifically to a different polynucleotide or polypeptide.

The compound can be an antibody, or a fragment thereof, an enzyme, a so-called
small molecule compound, or a protein-scaffold, preferably an anticalin. In a
preferred embodiment, the compound specifically binding to the polynucleotide or
polypeptide is an antibody, or a fragment thereof.

As used herein, an "antibody" comprises monoclonal antibodies as first described
by Köhler and Milstein in Nature 256 (1975), 495-497 as well as polyclonal
antibodies, i.e. antibodies contained in a polyclonal antiserum. Monoclonal
antibodies include those produced by transgenic mice. Fragments of antibodies
include F(ab')2, Fab and Fv fragments. Derivatives of antibodies include scFvs,
chimeric and humanized antibodies. See, for example Harlow and Lane, loc. cit.
For the detection of polypeptides using antibodies or fragments thereof, the person
skilled in the art is aware of a variety of methods, all of which are included in the
present invention. Examples include immunoprecipitation, Western blotting,
enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA),
dissociation-enhanced lanthanide fluoroimmunoassay (DELFIA), and
scintillation proximity assay (SPA). For detection, it is desirable if the antibody is
labelled by one of the labelling compounds and methods described supra.

In another preferred embodiment of the present invention, the method for
distinguishing immunologically defined ALL subtypes is carried out on an array. 

In general, an "array" or "microarray" refers to a linear or two- or
three-dimensional arrangement of preferably discrete nucleic acid or polypeptide
probes which comprises an intentionally created collection of nucleic acid or
polypeptide probes of any length spotted onto a substrate/solid support. The person
skilled in the art knows a collection of nucleic acids or polypeptides spotted onto a
substrate/solid support also under the term "array". As known to the person skilled
in the art, a microarray usually refers to a miniaturised array arrangement, with the
probes being attached at a density of at least about 10, 20, 50, 100 nucleic acid
molecules referring to different or the same genes per cm². Furthermore, where
appropriate an array can be referred to as "gene chip". The array itself can have
different formats, e.g. libraries of soluble probes or libraries of probes tethered to
resin beads, silica chips, or other solid supports.

The process of array fabrication is well-known to the person skilled in the art. In
the following, the process for preparing a nucleic acid array is described.
Commonly, the process comprises preparing a glass (or other) slide (e.g. chemical
treatment of the glass to enhance binding of the nucleic acid probes to the glass
surface), obtaining DNA sequences representing genes of a genome of interest, and
spotting these sequences of interest onto the glass slide. Sequences of
interest can be obtained via creating a cDNA library from an mRNA source or by
using publicly available databases, such as GenBank, to annotate the sequence
information of custom cDNA libraries or to identify cDNA clones from previously
prepared libraries. Generally, it is advisable to amplify the obtained sequences
by PCR in order to have sufficient amounts of DNA to print on the array. The
liquid containing the amplified probes can be deposited on the array by using a set
of microspotting pins. Ideally, the amount deposited should be uniform. The
process can further include UV-crosslinking in order to enhance immobilization of
the probes on the array.

In a preferred embodiment, the array is a high density oligonucleotide (oligo) array
using a light-directed chemical synthesis process, employing the so-called
photolithography technology. Unlike common cDNA arrays, oligo arrays
(according to the Affymetrix technology) use a single-dye technology. Given the
sequence information of the markers, the sequence can be synthesized directly onto
the array, thus bypassing the need for physical intermediates, such as PCR
products, required for making cDNA arrays. For this purpose, the marker, or partial
sequences thereof, can be represented by 14 to 20 features, preferably by less than
14 features, more preferably less than 10 features, even more preferably by 6
features or less, with each feature being a short sequence of nucleotides
(oligonucleotide), which is a perfect match (PM) to a segment of the respective
gene. The PM oligonucleotides are paired with mismatch (MM) oligonucleotides
which have a single mismatch at the central base of the oligonucleotide and are used
as "controls". The chip exposure sites are defined by masks and are deprotected by
the use of light, followed by a chemical coupling step resulting in the synthesis of
one nucleotide. The masking, light deprotection, and coupling process can then be
repeated to synthesize the next nucleotide, until the nucleotide chain is of the
specified length.
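The pairing of perfect match and mismatch oligonucleotides can be sketched in Python as follows. The text only states that the MM oligonucleotide carries a single mismatch at the central base; replacing that base by its complementary base, and the 25-nucleotide example sequence, are assumptions made for illustration only.

```python
# Assumed substitution rule: the central base is exchanged for its complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(perfect_match: str) -> str:
    """Return an MM partner that differs from the PM probe only at the central base."""
    pm = perfect_match.upper()
    if not pm or set(pm) - set(COMPLEMENT):
        raise ValueError("probe must be a non-empty DNA sequence of A, C, G, T")
    centre = len(pm) // 2  # index 12 for a 25-mer
    return pm[:centre] + COMPLEMENT[pm[centre]] + pm[centre + 1:]


# Hypothetical 25-mer used purely as an example; it is not a real probe sequence.
pm_example = "ACGTACGTACGTACGTACGTACGTA"
mm_example = mismatch_probe(pm_example)
differing_positions = [i for i, (p, m) in enumerate(zip(pm_example, mm_example)) if p != m]
```

For the example sequence, `differing_positions` contains only the central index, so the PM/MM pair differs by exactly one base.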

Advantageously, the method of the present invention is carried out in a robotics
system including robotic plating and a robotic liquid transfer system, e.g. using
microfluidics, i.e. channelled structures.

A particularly preferred method according to the present invention is as follows:

1. Obtaining a sample, e.g. bone marrow or peripheral blood aliquots, from a
patient having ALL

2. Extracting RNA, preferably mRNA, from the sample

3. Reverse transcribing the RNA into cDNA

4. In vitro transcribing the cDNA into cRNA

5. Fragmenting the cRNA

6. Hybridizing the fragmented cRNA on standard microarrays

7. Determining hybridization

In another embodiment, the present invention is directed to the use of at least one
marker selected from the markers identifiable by their Affymetrix Identification
Numbers (affy id) as defined in Tables 1 and/or 2 for the manufacturing of a
diagnostic for distinguishing immunologically defined ALL subtypes. The use of
the present invention is particularly advantageous for distinguishing
immunologically defined ALL subtypes in an individual having ALL. The use of
said markers for diagnosis of immunologically defined leukemia subtypes,
preferably based on microarray technology, offers the following advantages: (1)
more rapid and more precise diagnosis, (2) easy to use in laboratories without
specialized experience, (3) abolishes the requirement for analyzing viable cells for
chromosome analysis (transport problem), and (4) very experienced hematologists
for cytomorphology and cytochemistry, immunophenotyping as well as
cytogeneticists and molecular biologists are no longer required.

Accordingly, the present invention refers to a diagnostic kit containing at least one
marker selected from the markers identifiable by their Affymetrix Identification
Numbers (affy id) as defined in Tables 1 and/or 2 for distinguishing
immunologically defined ALL subtypes, in combination with suitable auxiliaries.

Suitable auxiliaries, as used herein, include buffers, enzymes, labelling compounds,
and the like. In a preferred embodiment, the marker contained in the kit is a nucleic
acid molecule which is capable of hybridizing to the mRNA corresponding to at
least one marker of the present invention. Preferably, the at least one nucleic acid
molecule is attached to a solid support, e.g. a polystyrene microtiter dish,
nitrocellulose membrane, glass surface or to non-immobilized particles in solution.

In another preferred embodiment, the diagnostic kit contains at least one reference
for a Pro-B-ALL, c-ALL, Pre-B-ALL, c-ALL/Pre-B-ALL, mature B-ALL,
precursor B-ALL, Pro-T-ALL, Pre-T-ALL, cortical T-ALL, mature T-ALL, and/or
T-ALL subtype. As used herein, the reference can be a sample or a data bank.

In another embodiment, the present invention is directed to an apparatus for
distinguishing immunologically defined ALL subtypes Pro-B-ALL, c-ALL,
Pre-B-ALL, c-ALL/Pre-B-ALL, mature B-ALL, precursor B-ALL, Pro-T-ALL,
Pre-T-ALL, cortical T-ALL, mature T-ALL, and/or T-ALL in a sample,
containing a reference data bank obtainable by a method comprising

(a) compiling a gene expression profile of a patient sample by determining
the expression level of at least one marker selected from the markers
identifiable by their Affymetrix Identification Numbers (affy id) as
defined in Tables 1 and/or 2, and
(b) classifying the gene expression profile by means of a machine learning
algorithm.

According to the present invention, the "machine learning algorithm" is a
computational-based prediction methodology, also known to the person skilled in
the art as "classifier", employed for characterizing a gene expression profile. The
signals corresponding to a certain expression level which are obtained by the
microarray hybridization are subjected to the algorithm in order to classify the
expression profile. Supervised learning involves "training" a classifier to recognize
the distinctions among classes and then "testing" the accuracy of the classifier on
an independent test set. For a new, unknown sample the classifier shall predict into
which class the sample belongs.

Preferably, the machine learning algorithm is selected from the group consisting of 
Weighted Voting, K-Nearest Neighbors, Decision Tree Induction, Support Vector 
Machines (SVM), and Feed-Forward Neural Networks. Most preferably, the
machine learning algorithm is Support Vector Machine, such as polynomial kernel 
and Gaussian Radial Basis Function-kernel SVM models. 
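To make the notion of a classifier concrete, the following Python sketch implements K-Nearest Neighbors, one of the algorithms listed above. It is chosen here only because it fits in a few lines of standard-library code; it is not the most preferred Support Vector Machine. The two-marker expression profiles and the class labels are invented toy values, not data from the tables.

```python
import math
from collections import Counter


def knn_predict(train_profiles, train_labels, query, k=3):
    """Predict the class of `query` by majority vote among its k nearest profiles.

    Each profile is a sequence of expression levels, one value per marker,
    and distance is plain Euclidean distance (an assumption of this sketch).
    """
    distances = sorted(
        (math.dist(profile, query), label)
        for profile, label in zip(train_profiles, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy training set: two hypothetical markers, three samples per class.
toy_profiles = [(1.0, 0.1), (0.9, 0.2), (1.1, 0.0),
                (0.1, 1.0), (0.2, 0.9), (0.0, 1.1)]
toy_labels = ["B-lineage"] * 3 + ["T-lineage"] * 3

predicted_b = knn_predict(toy_profiles, toy_labels, (0.95, 0.10))
predicted_t = knn_predict(toy_profiles, toy_labels, (0.10, 1.00))
```

The training step of this classifier consists merely of storing the labelled profiles; an SVM would instead fit a separating function to them, but the prediction interface (profile in, class label out) is the same.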

The classification accuracy of a given gene list for a set of microarray experiments 
is preferably estimated using Support Vector Machines (SVM), because there is
evidence that SVM-based prediction slightly outperforms other classification 
techniques like k-Nearest Neighbors (k-NN). The LIBSVM software package 
version 2.36 was used (SVM-type: C-SVC, linear kernel;
http://www.csie.ntu.edu...). The skilled artisan is furthermore

referred to Brown et al., Proc.Natl.Acad.Sci., 2000; 97: 262-267, Furey et al.,
Bioinformatics. 2000; 16: 906-914, and Vapnik V. Statistical Learning Theory.
New York: Wiley, 1998.

In detail, the classification accuracy of a given gene list for a set of microarray
experiments can be estimated using Support Vector Machines (SVM) as supervised
learning technique. Generally, SVMs are trained using differentially expressed
genes which were identified on a subset of the data and then this trained model is
employed to assign new samples to those trained groups from a second and
different data set. Differentially expressed genes were identified applying ANOVA
and t-test-statistics (Welch t-test). Based on identified distinct gene expression
signatures respective training sets consisting of 2/3 of cases and test sets with 1/3
of cases to assess classification accuracies are designated. Assignment of cases to
training and test set is randomized and balanced by diagnosis. Based on the training
set a Support Vector Machine (SVM) model is built.
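The randomized assignment of cases to a training set (2/3) and a test set (1/3), balanced by diagnosis, might be sketched in Python as below. The function name, the fixed random seed and the rounding rule for groups whose size is not divisible by three are assumptions of this sketch; the text does not specify them.

```python
import random
from collections import defaultdict


def balanced_split(diagnoses, train_fraction=2 / 3, seed=0):
    """Return (train_indices, test_indices), shuffled separately per diagnosis."""
    by_diagnosis = defaultdict(list)
    for index, diagnosis in enumerate(diagnoses):
        by_diagnosis[diagnosis].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for indices in by_diagnosis.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Illustrative cohort using three of the subtype names from the text.
example_diagnoses = ["c-ALL"] * 18 + ["Pre-T-ALL"] * 8 + ["precursor B-ALL"] * 3
train_idx, test_idx = balanced_split(example_diagnoses)
```

Because every diagnosis is split on its own, each subtype contributes roughly two thirds of its cases to training and one third to testing, even when the subtypes differ strongly in size.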

According to the present invention, the apparent accuracy, i.e. the overall rate of
correct predictions of the complete data set was estimated by 10-fold cross
validation. This means that the data set was divided into 10 approximately equally
sized subsets, an SVM model was trained for 9 subsets and predictions were
generated for the remaining subset. This training and prediction process was
repeated 10 times to include predictions for each subset. Subsequently the data set
was split into a training set, consisting of two thirds of the samples, and a test set
with the remaining one third. Apparent accuracy for the training set was estimated
by 10-fold cross validation (analogous to apparent accuracy for complete set). An
SVM model of the training set was built to predict diagnosis in the independent test
set, thereby estimating true accuracy of the prediction model. This prediction
approach was applied both for overall classification (multi-class) and binary
classification (diagnosis X => yes or no). For the latter, sensitivity and specificity
were calculated:

Sensitivity = (number of positive samples predicted)/(number of true positives) 
Specificity = (number of negative samples predicted)/(number of true negatives) 
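The cross validation scheme and the two ratios can be sketched in Python as follows. The sketch reads the formulas in the usual way, i.e. sensitivity as correctly predicted positives divided by all truly positive samples and specificity as correctly predicted negatives divided by all truly negative samples; this reading, the function names and the toy labels are assumptions. The fold function only partitions sample indices and expects the caller to have shuffled the samples beforehand.

```python
def cross_validation_folds(n_samples, n_folds=10):
    """Yield (train_indices, test_indices); fold sizes differ by at most one."""
    folds = [list(range(start, n_samples, n_folds)) for start in range(n_folds)]
    for held_out in folds:
        train = [i for fold in folds if fold is not held_out for i in fold]
        yield train, held_out


def sensitivity_specificity(true_labels, predicted_labels, positive):
    """Binary 'diagnosis X => yes or no' evaluation for the class `positive`."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        elif predicted == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Toy example: 4 samples with diagnosis X and 6 without.
toy_truth = ["X"] * 4 + ["other"] * 6
toy_predicted = ["X", "X", "X", "other"] + ["other"] * 5 + ["X"]
toy_sensitivity, toy_specificity = sensitivity_specificity(toy_truth, toy_predicted, "X")
```

In the toy example three of the four X samples are recognised (sensitivity 0.75) and five of the six other samples are correctly rejected (specificity 5/6).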

In a preferred embodiment, the reference data bank is backed up on a
computational data memory chip which can be inserted in as well as removed from
the apparatus of the present invention, e.g. like an interchangeable module, in order
to use another data memory chip containing a different reference data bank.

The apparatus of the present invention containing a desired reference data bank can
be used in a way such that an unknown sample is, first, subjected to gene
expression profiling, e.g. by microarray analysis in a manner as described supra or
in the art, and the expression level data obtained by the analysis are, second, fed
into the apparatus and compared with the data of the reference data bank obtainable
by the above method. For this purpose, the apparatus suitably contains a device for
entering the expression level of the data, for example a control panel such as a
keyboard. The results, whether and how the data of the unknown sample fit into the
reference data bank can be made visible on a provided monitor or display screen
and, if desired, printed out on an incorporated or connected printer.

Alternatively, the apparatus of the present invention is equipped with particular
appliances suitable for detecting and measuring the expression profile data and,
subsequently, proceeding with the comparison with the reference data bank. In this
embodiment, the apparatus of the present invention can contain a gripper arm
and/or a tray which takes up the microarray containing the hybridized nucleic
acids.

In another embodiment, the present invention refers to a reference data bank for
distinguishing immunologically defined ALL subtypes Pro-B-ALL, c-ALL,
Pre-B-ALL, c-ALL/Pre-B-ALL, mature B-ALL, precursor B-ALL, Pro-T-ALL,
Pre-T-ALL, cortical T-ALL, mature T-ALL, and/or T-ALL in a sample obtainable by
a method comprising

(a) compiling a gene expression profile of a patient sample by determining
the expression level of at least one marker selected from the markers
identifiable by their Affymetrix Identification Numbers (affy id) as
defined in Tables 1 and/or 2, and

(b) classifying the gene expression profile by means of a machine learning
algorithm.

Preferably, the reference data bank is backed up and/or contained in a
computational memory data chip.

The invention is further illustrated in the following tables and examples, without
limiting the scope of the invention:

TABLES 1.1-2.15

Tables 1.1-2.15 show ALL subtype analysis of subtypes Pro-B-ALL, c-ALL,
Pre-B-ALL, c-ALL/Pre-B-ALL, mature B-ALL, precursor B-ALL, Pro-T-ALL,
Pre-T-ALL, cortical T-ALL, mature T-ALL, and/or T-ALL. The analysed markers are
ordered according to their q-values, beginning with the lowest q-values.
For convenience and a better understanding, Tables 1.1 to 2.15 are accompanied
by explanatory tables (Tables 1.1A to 2.15A) where the numbering and the
Affymetrix Id are further defined by other parameters, e.g. GenBank accession
number.

EXAMPLES 

Example 1: General experimental design of the invention and results 

Acute lymphoblastic leukemia (ALL) is a heterogeneous group of diseases which
are classified immunologically. Most of the clinically relevant subgroups are
characterized by specific genetic translocations, i.e. translocations involving MLL
(tMLL) in Pro-B-ALL, t(9;22) in c-ALL and Pre-B-ALL, and t(8;14) in mature
B-ALL. While in childhood ALL gene expression profiling revealed specific gene
signatures in cytogenetically defined subgroups, the respective data are scarce in
adult ALL and, in particular, it is not known if the immunologically defined
subtypes of ALL which lack specific cytogenetic aberrations display a
characteristic gene expression profile. We analyzed global gene expression
signatures in bone marrow samples from 95 patients with newly diagnosed ALL by
use of microarray technology (Pro-B-ALL n=18, c-ALL n=18, Pre-B-ALL n=5,
c-ALL/Pre-B-ALL n=12, mature B-ALL n=11, precursor B-ALL n=3, Pro-T-ALL
n=2, Pre-T-ALL n=8, cortical T-ALL n=14, mature T-ALL n=2, T-ALL n=2). The
diagnosis was based on cytomorphology, immunophenotyping, and cytogenetic
and molecular genetic analyses. All samples were hybridized onto U133 set
microarrays (Affymetrix) representing >30,000 human transcripts. Differentially
expressed genes were identified applying ANOVA and t-test-statistics (Welch
t-test). To assess the false discovery rate we calculated q-values according to Storey
et al., PNAS 2003. Moreover, based on identified distinct gene expression
signatures we designated respective training sets consisting of 2/3 of cases and test
samples with 1/3 of cases to assess classification accuracies. Assignment of cases
to training and test set was randomized and balanced by diagnosis. Based on the
training set we built a Support Vector Machine (SVM) model. Classification
accuracy was assessed in the test set. In a first step, precursor B-ALL and precursor
T-ALL were distinguished in 31 independent test samples with an accuracy of
100%. In a second step samples were separated according to the EGIL
classification (Pro-B-ALL, c-ALL, Pre-B-ALL, mature B-ALL, Pre-T-ALL,
cortical T-ALL). Out of the 25 test samples 20 were classified correctly (accuracy:
80%). Samples misclassified were: c-ALL as Pre-B-ALL (n=2), c-ALL as mature
B-ALL, cortical T-ALL as Pre-T-ALL, and Pre-B-ALL as mature B-ALL (one
each). Samples with c-ALL and Pre-B-ALL were then further subgrouped
genetically according to positivity/negativity for t(9;22). Out of 29 test samples 24
were classified correctly (accuracy: 83%). Samples misclassified were:
c-ALL/Pre-B-ALL without t(9;22) as Pro-B-ALL and mature B-ALL (one each),
c-ALL/Pre-B-ALL with t(9;22) as c-ALL/Pre-B-ALL without t(9;22) and mature
B-ALL (one each), Pre-T-ALL as cortical T-ALL. These data demonstrate that distinct
immunologically defined subtypes of ALL are characterized by specific gene
expression profiles. Distinction between T-lineage and B-lineage disease is
accomplished with 100% accuracy while misclassification occurs in cases
belonging to subtypes closely related to each other with regard to the maturation
status. Gene expression profiling of ALL may help to optimize diagnostics of ALL
and to allow further insights into the pathogenesis of the biologically defined
subgroups.
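The counts reported in this example can be cross-checked with a few lines of Python. The sketch only re-adds numbers stated in the text, reading the mature B-ALL group as n=11; the variable and function names are illustrative.

```python
# Cohort composition as listed in the text (mature B-ALL read as n=11).
cohort = {
    "Pro-B-ALL": 18, "c-ALL": 18, "Pre-B-ALL": 5, "c-ALL/Pre-B-ALL": 12,
    "mature B-ALL": 11, "precursor B-ALL": 3, "Pro-T-ALL": 2,
    "Pre-T-ALL": 8, "cortical T-ALL": 14, "mature T-ALL": 2, "T-ALL": 2,
}

# Misclassified test samples per classification step, as listed in the text.
errors_step2 = {
    "c-ALL as Pre-B-ALL": 2,
    "c-ALL as mature B-ALL": 1,
    "cortical T-ALL as Pre-T-ALL": 1,
    "Pre-B-ALL as mature B-ALL": 1,
}
errors_step3 = {
    "c-ALL/Pre-B-ALL without t(9;22) as Pro-B-ALL": 1,
    "c-ALL/Pre-B-ALL without t(9;22) as mature B-ALL": 1,
    "c-ALL/Pre-B-ALL with t(9;22) as c-ALL/Pre-B-ALL without t(9;22)": 1,
    "c-ALL/Pre-B-ALL with t(9;22) as mature B-ALL": 1,
    "Pre-T-ALL as cortical T-ALL": 1,
}


def accuracy_percent(n_test_samples, n_misclassified):
    """Percentage of test samples that were classified correctly."""
    return 100 * (n_test_samples - n_misclassified) / n_test_samples


total_patients = sum(cohort.values())
accuracy_step2 = accuracy_percent(25, sum(errors_step2.values()))
accuracy_step3 = accuracy_percent(29, sum(errors_step3.values()))
```

The subtype counts add up to the 95 patients stated, and five misclassified samples in each step give 20/25 = 80% and 24/29, which rounds to 83%, in agreement with the reported accuracies.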


Example 2: General materials, methods and definitions of functional 
annotations 

The methods section contains both information on statistical analyses used for 
identification of differentially expressed genes and detailed annotation data of
identified microarray probesets. 

Affymetrix Probeset Annotation

All annotation data of GeneChip® arrays are extracted from the NetAffx™
Analysis Center (internet website: www.affymetrix.com). Files for U133 set arrays,
including U133A and U133B microarrays are derived from the June 2003 release.
The original publication refers to: Liu G, Loraine AE, Shigeta R, Cline M, Cheng J,
Valmeekam V, Sun S, Kulp D, Siani-Rose MA. NetAffx: Affymetrix probesets and
annotations. Nucleic Acids Res. 2003;31(1):82-6.

The sequence data are omitted due to their large size, and because they do not
change, whereas the annotation data are updated periodically, for example new
information on chromosomal location and functional annotation of the respective
gene products. Sequence data are available for download in the NetAffx Download
Center (www.affymetrix.com).

Data fields:

In the following section, the content of each field of the data files is described.
Microarray probesets, for example found to be differentially expressed between
different types of leukemia samples are further described by additional information.
The fields are of the following types:

1. GeneChip Array Information

2. Probe Design Information

3. Public Domain and Genomic References

1. GeneChip Array Information

HG-U133 ProbeSet_ID:

HG-U133 ProbeSet_ID describes the probe set identifier. Examples are:
200007_at, 200011_s_at, 200012_x_at
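Identifiers of the form shown in the examples (200007_at, 200011_s_at, 200012_x_at) can be split into a numeric part and an optional one-letter flag with a short Python sketch. The regular expression below is derived only from these three examples and is an assumption; it is not an official description of the Affymetrix naming scheme, and other identifier forms would be rejected.

```python
import re

# Pattern inferred from the three example identifiers only.
PROBESET_PATTERN = re.compile(r"^(?P<number>\d+)(?:_(?P<flag>[a-z]))?_at$")


def parse_probeset_id(probeset_id):
    """Split a probe set identifier into (number, flag); flag is None if absent."""
    match = PROBESET_PATTERN.match(probeset_id)
    if match is None:
        raise ValueError(f"unrecognised probe set identifier: {probeset_id!r}")
    return int(match["number"]), match["flag"]


parsed_examples = [parse_probeset_id(p)
                   for p in ("200007_at", "200011_s_at", "200012_x_at")]
```

Such a parser can serve as a simple plausibility check when probe set identifiers are read from annotation files before they are matched against the tables.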

GeneChip:

The description of the GeneChip probe array name where the respective probeset is
represented. Examples are: Affymetrix Human Genome U133A Array or
Affymetrix Human Genome U133B Array.

2. Probe Design Information

Sequence Type:

The Sequence Type indicates whether the sequence is an Exemplar, Consensus or
Control sequence. An Exemplar is a single nucleotide sequence taken directly from
a public database. This sequence could be an mRNA or EST. A Consensus
sequence is a nucleotide sequence assembled by Affymetrix, based on one or more
sequences taken from a public database.

Transcript ID:

The cluster identification number with a sub-cluster identifier appended.

Sequence Derived From:

The accession number of the single sequence, or representative sequence on which
the probe set is based. Refer to the "Sequence Source" field to determine the
database used.

Sequence ID:

For Exemplar sequences: Public accession number or GenBank identifier. For
Consensus sequences: Affymetrix identification number or public accession
number.

Sequence Source:

The database from which the sequence used to design this probe set was taken.
Examples are: GenBank®, RefSeq, UniGene, TIGR (annotations from The
Institute for Genomic Research).

3. Public Domain and Genomic References

Most of the data in this section come from LocusLink and UniGene databases, and
are annotations of the reference sequence on which the probe set is modeled.

Gene Symbol and Title:

A gene symbol and a short title, when one is available. Such symbols are assigned
by different organizations for different species. Affymetrix annotational data come
from the UniGene record. There is no indication which species-specific databank
was used, but some of the possibilities include for example HUGO: The Human
Genome Organization.

MapLocation:

The map location describes the chromosomal location when one is available.

Unigene_Accession:

UniGene accession number and cluster type. Cluster type can be "full length" or
"est", or " — " if unknown.

LocusLink:

This information represents the LocusLink accession number.

Full Length Ref. Sequences:

Indicates the references to multiple sequences in RefSeq. The field contains the ID
and description for each entry, and there can be multiple entries per probeSet.

Example 3: Sample preparation, processing and data analysis 

Method 1: 

Microarray analyses were performed utilizing the GeneChip® System (Affymetrix, 
Santa Clara, USA). Hybridization target preparations were performed according to 
recommended protocols (Affymetrix Technical Manual). In detail, at time of 
diagnosis, mononuclear cells were purified by Ficoll-Hypaque density 
centrifugation. They were lysed immediately in RLT buffer (Qiagen, Hilden, 
Germany), frozen, and stored at -80°C from 1 week to 38 months. For gene 
expression profiling cell lysates of the leukemia samples were thawed, 
homogenized (QIAshredder, Qiagen), and total RNA was extracted (RNeasy Mini 
Kit, Qiagen). Subsequently, 5-10 µg total RNA isolated from 1 x 10^7 cells was 
used as starting material for cDNA synthesis with oligo[(dT)24T7promotor]65 
primer (cDNA Synthesis System, Roche Applied Science, Mannheim, Germany). 
cDNA products were purified by phenol/chloroform/IAA extraction (Ambion, 
Austin, USA) and acetate/ethanol-precipitated overnight. For detection of the 
hybridized target nucleic acid biotin-labeled ribonucleotides were incorporated 
during the following in vitro transcription reaction (Enzo BioArray HighYield 
RNA Transcript Labeling Kit, Enzo Diagnostics). After quantification by 
spectrophotometry measurements and 260/280 absorbance values assessment for 
quality control of the purified cRNA (RNeasy Mini Kit, Qiagen), 15 µg cRNA was 
fragmented by alkaline treatment (200 mM Tris-acetate, pH 8.2/500 mM potassium 
acetate/150 mM magnesium acetate) and added to the hybridization cocktail 
sufficient for five hybridizations on standard GeneChip microarrays (300 µl final 
volume). Washing and staining of the probe arrays was performed according to the 
recommended Fluidics Station protocol (EukGE-WS2v4). Affymetrix Microarray 
Suite software (version 5.0.1) extracted fluorescence signal intensities from each 
feature on the microarrays as detected by confocal laser scanning according to the 
manufacturer's recommendations. 
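The spectrophotometric quantification and the 260/280 assessment mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: the function names are hypothetical, the conversion factor of 40 µg/ml per A260 unit is the usual approximation for single-stranded RNA at a 1 cm path length, and the readings in the example are made up.

```python
# Hypothetical helper names; the readings below are made-up example values.
UG_PER_ML_PER_A260 = 40.0  # common approximation for single-stranded RNA, 1 cm path


def rna_concentration_ug_per_ml(a260: float, dilution_factor: float) -> float:
    """Concentration of the undiluted sample from the A260 of a diluted aliquot."""
    return a260 * UG_PER_ML_PER_A260 * dilution_factor


def absorbance_ratio(a260: float, a280: float) -> float:
    """260/280 absorbance ratio used as a purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


def volume_for_mass_ul(mass_ug: float, conc_ug_per_ml: float) -> float:
    """Volume in microlitres that contains the requested mass of RNA."""
    if conc_ug_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return mass_ug / conc_ug_per_ml * 1000.0


conc = rna_concentration_ug_per_ml(a260=0.5, dilution_factor=50)
ratio = absorbance_ratio(a260=0.5, a280=0.25)
volume_ul = volume_for_mass_ul(mass_ug=15.0, conc_ug_per_ml=conc)  # 15 µg for fragmentation
```

The last call shows how the 15 µg of cRNA taken for fragmentation translates into a pipetting volume once the concentration is known.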

Expression analysis quality assessment parameters included visual inspection of 
the scanned array image for the presence of image artifacts and for correct grid 
alignment for the identification of distinct probe cells, as well as both a low 3'/5' 
ratio of housekeeping controls (mean: 1.90 for GAPDH) and a high percentage of 
detection calls (mean: 46.3% present called genes). The 3' to 5' ratio of GAPDH 
probe sets can be used to assess RNA sample and assay quality. Signal values of the 
3' probe sets for GAPDH are compared to the Signal values of the corresponding 
5' probe set. The ratio of the 3' probe set to the 5' probe set is generally no more 
than 3.0. A high 3' to 5' ratio may indicate degraded RNA or inefficient synthesis 
of ds cDNA or biotinylated cRNA (GeneChip® Expression Analysis Technical 
Manual, www.affymetrix.com). Detection calls are used to determine whether the 
transcript of a gene is detected (present) or undetected (absent) and were calculated 
using default parameters of the Microarray Analysis Suite MAS 5.0 software 
package. 
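The two numeric quality parameters described above, the 3'/5' ratio and the percentage of present calls, can be computed as in the following sketch. The function names and the example values are hypothetical. The call codes "P", "M" and "A" (present, marginal, absent) are assumed to be the form in which detection calls are exported. Only the limit of 3.0 is taken from the text.

```python
from typing import Iterable

MAX_3P_5P_RATIO = 3.0  # from the text: the ratio is generally no more than 3.0


def three_to_five_ratio(signal_3p: float, signal_5p: float) -> float:
    """Ratio of the 3' probe set Signal to the 5' probe set Signal."""
    if signal_5p <= 0:
        raise ValueError("5' signal must be positive")
    return signal_3p / signal_5p


def ratio_acceptable(signal_3p: float, signal_5p: float,
                     max_ratio: float = MAX_3P_5P_RATIO) -> bool:
    """True if the 3'/5' ratio does not exceed the given limit."""
    return three_to_five_ratio(signal_3p, signal_5p) <= max_ratio


def percent_present(calls: Iterable[str]) -> float:
    """Percentage of probe sets whose detection call is 'P' (present)."""
    calls = list(calls)
    if not calls:
        raise ValueError("no detection calls given")
    return 100.0 * sum(call == "P" for call in calls) / len(calls)


# Made-up illustrative values, not measurements from the study.
gapdh_ratio = three_to_five_ratio(signal_3p=1900.0, signal_5p=1000.0)
present_pct = percent_present(["P", "A", "P", "M"])
```

A high ratio would flag a sample for possible RNA degradation or inefficient cDNA or cRNA synthesis, as stated above.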


Method 2: 

Bone marrow (BM) aspirates are taken at the time of the initial diagnostic biopsy 
and remaining material is immediately lysed in RLT buffer (Qiagen), frozen and 
stored at -80°C until preparation for gene expression analysis. For microarray 
analysis the GeneChip System (Affymetrix, Santa Clara, CA, USA) is used. The 
targets for GeneChip analysis are prepared according to the current Expression 
Analysis Technical Manual. Briefly, frozen lysates of the leukemia samples are 
thawed, homogenized (QIAshredder, Qiagen) and total RNA is extracted (RNeasy 
Mini Kit, Qiagen). Normally 10 µg total RNA isolated from 1 x 10^7 cells is used 
as starting material in the subsequent cDNA synthesis using Oligo-dT-T7-Promotor 
Primer (cDNA Synthesis Kit, Roche Molecular Biochemicals). The cDNA is purified 
by phenol-chloroform extraction and precipitated with 100% ethanol overnight. For 
detection of the hybridized target nucleic acid biotin-labeled ribonucleotides are 
incorporated during the in vitro transcription reaction (Enzo® BioArray™ 
HighYield™ RNA Transcript Labeling Kit, ENZO). After quantification of the 
purified cRNA (RNeasy Mini Kit, Qiagen), 15 µg are fragmented by alkaline 
treatment (200 mM Tris-acetate, pH 8.2, 500 mM potassium acetate, 150 mM 
magnesium acetate) and added to the hybridization cocktail sufficient for 5 
hybridizations on standard GeneChip microarrays. Before expression profiling 
Test3 Probe Arrays (Affymetrix) are chosen for monitoring of the integrity of the 
cRNA. Only labeled cRNA cocktails which show a ratio of the measured 
intensity of the 3' to the 5' end of the GAPDH gene of less than 3.0 are selected for 
subsequent hybridization on HG-U133 probe arrays (Affymetrix). Washing and 
staining of the probe arrays is performed as described (see the original Affymetrix 
literature (Lockhart and Lipshutz)). The Affymetrix software (Microarray 
Suite, Version 4.0.1) extracts fluorescence intensities from each element on the 
arrays as detected by confocal laser scanning according to the manufacturer's 
recommendations. 
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31 232204_at 


EBF 


-43.62 


r- -tap ^ a 

5.79E-12 


4.52E-09 


-0.90 


-8.1 u oqo4 


32 204446_s_at 


ALOX5 


-7.54 


1 .25E-1 1 


9.15E-09 


-0.87 


O ftft ACXnA A O 

-8.0U 1Uql 1.2 


33 210146_x_at 


LILRB2 


T AA 

-7.96 


H rv A C A A 

1 .01 E-1 1 


7.63E-09 -0.86 


•7 on a c\r%A o >i 

-/.yy iyqio.4 


A A A A^A A""» J 

34 207697_x_at 


LILRB2 


il O il 

-4.84 


6.33E-10 


2.65E-07 -0.91 


*7 ftQ d Q/mi O A 

-7.98 19qlo.4 


35 207467_x_at 


CAST 


-2.91 


9.65E-09 


2.26E-06 


-0.95 


7 ftQ C«<4C «OH 

-f .yo oqio-q^i 


36 226878_ at 




o no 

-3.93 


yf ft iii— a r\ 
4.94 E- 10 


2.22E-07 -0.90 


-/ ,y4 


37 208651_x_at 


CD24 


-6.55 


a Aip* i a 

3.91 E-10 


1.79E-07 


-0.88 


-7.83 oq21 


AA 4^4^^ A Jl J. 

38 205640_at 


ai t^i i o n <i 

ALDH3B1 


-5.51 


A AAp A A 

1 .36E-1 1 


9.63E-09 


-0.84 


7 Q A A A r*A O 

-f.oi nqio 


39 222701_s_at MGC2217 -3.28 6.51E-09 1.77E-06 -0.90 -7.75 8q11.23 

40 201161_s_at CSDA -4.98 3.70E-07 4.33E-05 -1.00 -7.73 12p13.1 

41 227646_at -22.93 3.36E-11 2.25E-08 -0.85 -7.70 

42 205049_s_at CD79A -7.16 2.78E-11 1.91E-08 -0.82 -7.66 19q13.2 

43 224796_at DDEF1 -2.01 6.93E-11 4.38E-08 -0.83 -7.64 8q24.1-q24.2 


44 203300_x_at 


AP1S2 


/-S AA 

-2.09 


A AOC A r\ 

1.06E-10 


6.52E-08 -0.83 


7 dA YnOO OH 

-/ .b4 Ap^.ol 


45 22221 7_s_at 


SLC27A3 


-3.67 


3.42E-10 


1.60E-07 


-0.85 


7 CO H«OH H 

-/ ,bo iq^il .1 


46 205101_at 


MHC2TA 


-6.89 


4.66E-1 1 


3.03E-08 


-0.81 


7 CQ H Ca'I O 

-7.58 1op13 


47 226459_at FLJ35564 -3.89 4.44E-09 1.27E-06 -0.87 -7.57 10q23.33 

48 56256_at CGI-40 -2.07 1.80E-10 9.58E-08 -0.81 -7.44 11q23.3 

49 204604_at PFTK1 -4.97 1.87E-10 9.74E-08 -0.80 -7.40 7q21-q22 

50 208178_x_at TRIO -6.80 1.09E-10 6.52E-08 -0.80 -7.38 5p15.1-p14 

WO 2005/045437 

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# affvid 


HUGO name 


fc 


1 227353_at EVER2 -3.51 

2 225637_at FLJ20186 -5.19 

3 202853_s_at RYK -4.26 

4 204949_at ICAM3 -5.58 
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-Z.O I 
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-O.OO 


1A 99G91A at 
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O.OU 
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I OfM*;9Aft^ 
LUU I 0Z400 


-ft 19 
-D. IZ 


1ft 9A91 H at 

I d o4z i u ai 


OL/VVOZ 


7 OA 


17 9flA99ft at 

i / zu4ozo_ai 


C\/CRi 


9 CiR 
-Z.UO 


1ft 9*1 9flft9 at 

lo z izuoo ai 


L-L/44 


9 

z.oy 


1Q 99ft7*>A at 


|/|A AH 71Q 
r\IM/A I / I y 
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-z. / o 


9H 91QAft9 at 

zu z i y40o ai 


r*90«-\rf 1 09 
OZUOiT I UO 




9*1 99ft7ftA at 

z 1 zzor o*t — ai 


I OP1^9AftR 
LVJO I0Z400 


1A HA 


99 9HQft99 e at 

zz zuyozz_s_ai 


\/l HI D 
VLULK 


7 99 
f .OZ 


99 991QftQ at 

zo zz I yoy ai 


rMAO 


A 99 
4.00 


9 A O AO AAA at 

Z4 Z4Z4 i4_ai 




A 9Q 

4.oy 


9R 909090 at 

zo zuouzu_ai 


k1AAOA71 


9 09 

-Z.UO 


26 227134_at JFC1 -2.40 

27 203593_at CD2AP -4.02 

28 225912_at TP53INP1 -5.53 

29 207734_at LAX -1.93 

30 215925_s_at CD72 7.99 

31 218066_at SLC12A7 1.98 

32 203725_at GADD45A -3.29 

33 219033_at FLJ21308 3.84 

34 225703_at KIAA1545 2.27 

35 228758_at -4.41 

36 200045_at - HG-U133B ABCF1 -1.65 

37 217940_s_at FLJ10769 -2.82 

38 203139_at DAPK1 -4.55 

39 228083_at CACNA2D4 7.35 

40 55872_at KIAA1196 -3.43 

41 204794_at DUSP2 -3.94 

42 217168_s_at HERPUD1 -2.28 

43 219045_at ARHF -2.26 
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1 1ftP <fQ 

i . ice- iy 


1 ORP 1<^ 

1 .zot- 1 0 
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- I .Z/ 


1 1 ftft 17n9*> Q 

- I 1 .00 1 f qzo.o 
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1 9£P 1R 
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p13.2 


6.51E-10 4.52E-07 1.73 10.82 13q12.13 

9.06E-17 3.90E-13 -1.11 -10.38 14 

5.02E-11 5.15E-08 1.27 9.94 9p12 

4.36E-15 1.34E-11 -1.08 -9.81 4p11 

3.84E-15 1.34E-11 -1.07 -9.79 14 

1.51E-08 4.53E-06 1.96 9.61 2p14-p13 

5.75E-14 1.55E-10 -0.96 -8.99 10q21-q22 

3.34E-13 7.63E-10 -0.95 -8.78 3q22 

3.54E-13 7.63E-10 -0.95 -8.71 1p36 

7.21E-08 1.46E-05 1.49 8.45 

1.21E-12 2.18E-09 -0.91 -8.43 4q31.1 

1.06E-12 2.08E-09 -0.91 -8.42 1p36 

6.17E-12 9.64E-09 -0.91 -8.27 17q25.3 

2.51E-08 6.51E-06 1.16 8.16 11p13 

6.03E-11 5.90E-08 -0.91 -8.09 3p24-p23 

1.28E-07 2.25E-05 1.39 8.09 20p12 

1.40E-11 1.74E-08 -0.90 -8.02 4q31.1 

1.86E-07 2.85E-05 1.48 8.01 9p24 

9.12E-08 1.75E-05 1.25 7.98 9p13 

1.33E-07 2.27E-05 1.33 7.98 

6.27E-12 9.64E-09 -0.86 -7.98 1q24-q25 


A. AAtZ A A 
1 .41 1-1 1 


A 7 AC no 
1 . f 4t-Uo 


-U.00 


-r ,yo npoo.o 


1 AAtZ A A 


1 7AC Oft 


n ftft 
-u.oo 


7 OI ftr%1 9 

-1 .y 1 op iz 


A OCC A A 


A COC HQ 


-U.o4 


7 ft7 ft«99 

-f .Of oqZZ 


3.02E-11 3.42E-08 -0.85 -7.80 1q32.1 

2.65E-07 3.67E-05 1.36 7.74 9p13.1 

1.43E-08 4.35E-06 0.97 7.65 5p15 

3.30E-11 3.55E-08 -0.82 -7.63 1p31.2-p31.1 


O. ft7C_A7 
0.0/ fc~U/ 


o.uot-uo 


l.oU 


7 f^1 Rn1 1 1 

/ .01 oq 1 1 .1 


*1 CCC A7 


Z.04t:-L/0 


a An 


7 Aft 19n9A 99 

#.4o izqz4.oo 


O **1 P 1 1 

y .0 1 1- 1 1 


ft 79P-Hft 


n ftfl 


-7 A9 


o.zyt-iu 


9 ft9P_H7 
Z.OZt-U ( 


-U.o I 


7 9ft ftr\91 99 
-/ .00 opz I .OO 


1.29E-10 1.16E-07 -0.79 -7.34 13q33.3 

1.88E-10 1.62E-07 -0.78 -7.24 9q34.1 

6.61E-07 7.66E-05 1.22 7.20 12p13.33 

3.05E-10 2.53E-07 -0.78 -7.20 20q13.33 

3.70E-10 2.84E-07 -0.79 -7.19 2q11 

6.84E-10 4.60E-07 -0.79 -7.19 16q12.2-q13 

3.31E-09 1.45E-06 -0.81 -7.14 12q24.31 
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44 24361 8_s_at 


LOC1 52485 




q i^nc in 


j.OOC-Ur -U.OH 


-7 11 4a 31 1 


45 200965_s_at ABLIM1 -4.68 4.57E-10 3.32E-07 -0.76 -7.07 10q25 

46 225613_at KIAA0303 -3.95 4.63E-10 3.32E-07 -0.76 -7.05 5q12.3 

47 210024_s_at UBE2E3 -3.55 8.10E-10 4.98E-07 -0.76 -7.01 2q32.1 

48 203435_s_at MME -20.47 1.51E-09 7.93E-07 -0.81 -6.98 3q25.1-q25.2 

49 219648_at FLJ10116 -4.27 7.10E-10 4.63E-07 -0.75 -6.97 2q35 

50 210424_s_at GOLGIN-67 -4.01 2.66E-09 1.25E-06 -0.78 -6.97 15q11.2 



Table 2 

2. All-Pairs (AP) 



2.1 ball versus cpre 



# affy id HUGO name fc p q stn t Map Location 

1 219506_at FLJ23221 -3.45 9.55E-06 0.16997203 -1.54 -6.75 1q21.2 

2 235509_at MGC40214 1.55 4.30E-06 0.15843086 1.37 6.38 8q22.1 

3 205006_s_at NMT2 -3.42 1.38E-05 0.16997203 -1.36 -6.17 10p13 

4 217979_at NET-6 -7.33 0.00018505 0.30283227 -1.46 -5.60 7p21.1 

5 225927_at 2.49 2.07E-05 0.19100352 1.19 5.57 

6 239835_at TA-KRP 2.17 5.21E-05 0.2517793 1.20 5.43 3p14 

7 225606_at LOC150819 3.84 7.49E-05 0.2517793 1.18 5.32 2q12.3 

8 225557_at AXUD1 -2.99 0.00010941 0.28818669 -1.21 -5.32 3p22 

9 225455_at STAF42 1.61 4.17E-05 0.2517793 1.13 5.29 1q23.2 

10 212124_at RAI17 -3.26 6.41E-05 0.2517793 -1.15 -5.28 10q22.3 

11 221624_at TCL6 2.44 5.38E-05 0.2517793 1.14 5.27 14q32.1 

12 225570_at SLC41A1 -2.39 0.00014923 0.29947256 -1.19 -5.19 1q32.1 

13 229061_s_at SLC25A13 2.04 0.00015302 0.29947256 1.16 5.12 7q21.3 

14 212841_s_at PPFIBP2 9.54 0.00039491 0.30409429 1.34 5.10 11p15.3 

15 218312_s_at FLJ12895 -2.95 0.000104 0.28818669 -1.12 -5.09 19q13.43 

16 234107_s_at HARS2 2.75 7.51E-05 0.2517793 1.09 5.06 20p11.23 

17 203688_at PKD2 -4.76 0.00040307 0.30409429 -1.30 -5.05 4q21-q23 

18 235353_at KIAA0746 2.66 0.00010238 0.28818669 1.10 5.05 4p15.2 

19 209001_s_at DKFZP566D193 1.76 6.89E-05 0.2517793 1.07 5.02 3q22.1 

20 211031_s_at CYLN2 -15.41 0.00051954 0.30410368 -1.34 -4.96 7q11.23 

21 228153_at LOC255488 5.56 0.00035957 0.30409429 1.18 4.92 6p22.3 

22 224450_s_at AD034 1.74 0.00019836 0.30283227 1.10 4.91 6p24.3 

23 224654_at 1.57 0.0001543 0.29947256 1.07 4.88 

24 201539_s_at FHL1 -3.36 0.00046369 0.30409429 -1.21 -4.88 Xq26 

25 236656_s_at -5.07 0.00055979 0.30521174 -1.28 -4.88 

26 221268_s_at SGPP1 2.14 0.0002704 0.30283227 1.11 4.85 14q23.1 

27 218066_at SLC12A7 -2.31 0.00012424 0.29947256 -1.04 -4.84 5p15 

28 213541_s_at ERG -11.53 0.00066791 0.30549286 -1.36 -4.83 21q22.3 

29 210835_s_at CTBP2 -3.11 0.00021882 0.30283227 -1.06 -4.79 10q26.2 

30 203859_s_at PALM -3.45 0.00026474 0.30283227 -1.07 -4.76 19p13.3 

31 210298_x_at FHL1 -21.41 0.00076087 0.3277722 -1.37 -4.76 Xq26 
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32 226005_at 




2.1 7 0.00030052 0.30409429 


1.08 


A 7C 

4.f O 


33 212313_at 


MGC29816 


3.25 0.00044995 0.30409429 


1.13 


>l Til Qr\Oi O 

4-/4 op/ 1 


34 202136_at 


BS69 


n —r r\ r\ r\r\f\*">t\A A A f\ O O A OO A OO 

-2.79 0.00039144 0.30409429 


A A f\ 

-1.10 


-4.73 10pi4 


35 220987_s_at 


SNARK 


1.99 0.00014577 0.299472oo 


A f\f\ 

1.00 


vl CO H«00 *1 


36 201220_x_at 


CTBP2 


-2.67 0.00026099 0.30283227 


A f\A 

-1.04 


-4.69 10q2b.^ 


37 203198_at 


CDK9 


-2.29 0.00019436 0.30283227 


A AO 

-1.02 


-4.DO 9qo4.1 


38 226271_at 




*\ — r r~ /\ nnnnoAon o 0000000*7 

2.75 0.00022029 0.30283227 


1.02 


/I AQ 

4. oo 


39 203664_s_at 


POLR2D 


1 .68 0.0002083 0.30283227 


A AO 

1.02 


4.68 2Q21 


40 212012_at 




-14.45 0.00086071 0.33453416 


-1.33 


-4.67 


41 232950_s_at 


NIR3 


3.01 0.00048654 0.30409429 


a A r\ 

1.10 


4.6b 12q24.o1 


42 203373_at 


SOCS2 


-25.42 0.00091929 0.33453416 


A O O 

-1.36 


>l Oil 4 0n 

-4.64 12q 


43 228390_at 




7.11 0.00073011 0.32052047 


H HO 

1.18 


A CA 

4.64 


44 212136_at 


ATP2B4 


-2.54 0.00037147 0.30409429 


-1.05 


il Oil A /-OC -Ol 

-4.64 1q2o-qo2 


45 209048_s_at 


PRKCBP1 


-1 .72 0.00023474 0.30283227 


A f\A 

-1.01 


-4.63 20ql o.l 2 


46 210644_s_at 


LAIR1 


-2.97 U.OUU^o/ iO U.olMCDO^sr 


1 Hi 
-1 .U 1 


_A ft9 iQnl A 
-*f.oz lyqio.H- 


47 235273_at EKN1 4.18 0.0002388 0.30283227 1.00 4.61 15q21.1 

48 203622_s_at LOC56902 1.61 0.00025645 0.30283227 0.99 4.57 2p13.2 

49 215222_x_at MACF1 -2.43 0.00048313 0.30409429 -1.05 -4.57 1p32-p31 

50 242292_at MGC34827 5.07 0.00055811 0.30521174 1.07 4.56 Xq13.1 



2.2 ball versus cpreph 



# affy id HUGO name fc p q stn t Map Location 

1 203373_at SOCS2 -36.06 5.83E-14 1.74E-09 -3.09 -15.85 12q 

2 201029_s_at CD99 -4.72 1.47E-13 2.20E-09 -2.06 -12.13 Xp22.32 

3 210487_at DNTT -373.56 2.87E-11 2.87E-07 -2.41 -11.84 10q23-q24 

4 201540_at FHL1 -13.76 1.19E-10 8.91E-07 -1.98 -10.66 Xq26 

5 212012_at -19.01 1.12E-09 2.40E-06 -1.88 -9.72 

6 217979_at NET-6 -7.66 3.59E-10 1.19E-06 -1.71 -9.65 7p21.1 

7 227584_at -6.64 6.56E-10 1.78E-06 -1.69 -9.46 

8 215537_x_at DDAH2 -6.41 2.41E-10 9.21E-07 -1.61 -9.33 6p21.3 

9 209806_at HIST1H2BK -3.24 1.90E-10 9.21E-07 -1.59 -9.27 6p21.33 

10 203372_s_at SOCS2 -51.95 3.20E-09 5.64E-06 -1.86 -9.25 12q 

11 202123_s_at ABL1 -3.32 5.33E-10 1.60E-06 -1.57 -9.08 9q34.1 

12 213056_at KIAA1013 -5.29 8.82E-10 2.03E-06 -1.59 -9.05 3p14.1 

13 224710_at RAB34 -6.65 2.24E-10 9.21E-07 -1.53 -9.02 17q11.1 

14 210299_s_at FHL1 -16.35 4.90E-09 7.72E-06 -1.71 -8.95 Xq26 

15 204663_at ME3 -3.67 2.46E-10 9.21E-07 -1.51 -8.95 11cen-q22.3 

16 202519_at MONDOA -3.70 1.28E-09 2.56E-06 -1.56 -8.89 12q21.31 

17 218589_at P2RY5 -19.16 6.12E-09 8.73E-06 -1.71 -8.86 13q14 

18 206995_x_at SCARF1 -3.07 7.92E-10 1.98E-06 -1.47 -8.66 17p13.3 

19 212013_at D2S448 -118.51 1.35E-08 1.45E-05 -1.73 -8.55 2pter-p25.1 

20 219506_at FLJ23221 -4.99 4.22E-09 7.02E-06 

21 209530_at CACNB3 -4.68 

22 224772_at NAV1 -6.79 
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10 20961 9_at 


CD74 


6.21 


8.10E-06 0.04823332 


1.50 


6.45 5q32 


11 201137_s_at HLA-DPB1 12.99 6.54E-05 0.11029051 1.64 6.27 6p21.3 

12 201161_s_at CSDA 6.18 1.53E-05 0.06014434 1.46 6.25 12p13.1 

13 213293_s_at TRIM22 2.45 9.21E-06 0.04823332 1.43 6.24 11p15 

14 208894_at HLA-DRA 24.85 0.00010573 0.12304972 1.76 6.14 6p21.3 

15 203932_at HLA-DMB 8.59 0.00010149 0.12304972 1.65 6.07 6p21.3 

16 226459_at FLJ35564 3.96 2.67E-05 0.08393232 1.40 5.95 10q23.33 

17 208651_x_at CD24 9.40 7.60E-05 0.11948472 1.49 5.93 6q21 

18 204446_s_at ALOX5 13.16 0.00011774 0.12332854 1.58 5.91 10q11.2 
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I9 205088_at 


CXorf6 


2.94 3.09E-05 0.08508103 


1.38 


5.89 Xq28 


20 204639_at 


ADA 


-2.82 0.00015852 0.13388836 


-1.48 


-5.87 20q12- 










q13.11 


21 213142_x_at 


LOC54103 


5.86 0.00011393 0.12332854 


A f—O 

1.52 


5.84 /qn.M 


22 209312_x_at HLA-DRB1 10.87 0.00019668 0.13783906 1.56 5.63 6p21.3 

23 219202_at FLJ22341 4.23 5.74E-05 0.10603287 1.32 5.61 17q25.3 

24 232594_at 2.67 4.90E-05 0.10265345 1.31 5.61 

25 221969_at PAX5 14.49 0.00021367 0.14256177 1.57 5.60 9p13 

26 205504_at BTK 2.84 4.73E-05 0.10265345 1.27 5.50 Xq21.33-q22 


27 211991_s_at 


HLA-DPA1 


21.40 0.00025046 0.1513475 


a 

1.53 


O.4o op^H .0 


28 213539_at 


CD3D 


-11.72 0.00078516 0.21686687 


A A 

-1.64 


CZ AC A A «00 

-5.46 11qzo 


29 215193_x_at 


HLA-DRB1 


17.00 0.00027739 0.16085873 


A CT 

1.57 


5.44 op21 .0 


30 211065_x_at 


PFKL 


2.50 9.23E-05 0.12304972 


A AA 

1.29 


5.42 2lq2z.o 


31 203721_s_at 


CGI-48 


-1.54 6.67E-05 0.11029051 


-1.25 


-5.39 17q21 .00 


32 212998_x_at 


HLA-DQB1 


23.06 0.00029381 0.16085873 


1.50 


C OT C»-kO*1 O 

5.37 opzl.o 


33 22291 5_s_at 


BANK 


2.85 0.00017388 0.13388836 


A O O 

1.33 


5.36 4qzo 


34 201721_s_at LAPTM5 1.71 0.00013123 0.12886013 1.27 5.35 1p34 

35 233358_at FLJ14311 2.00 8.82E-05 0.12304972 1.25 5.33 19 

36 236745_at FLJ34512 4.84 9.62E-05 0.12304972 1.25 5.31 16p13.3 

37 242292_at MGC34827 -6.28 0.0005803 0.2003799 -1.42 -5.30 Xq13.1 

38 216705_s_at ADA -2.44 0.00016535 0.13388836 -1.24 -5.22 20q12-q13.11 


39 201160_s_at 


CSDA 


3.16 0.00021777 0.14256177 


A OC 

1.25 


C A O A A-J Q A 

5.1o 12p1o.l 


40 210844_x_at 


CTNNA1 


5.22 0.00010021 0.12304972 


a on 

1.20 


5.16 5qo1 


41 221978_at 


HLA-F 


2.55 0.00016221 0.13388836 


«f oo 

1.23 


C A O CrtO*! O 

5.13 6p21 .o 


42 236656_s_at 




6.98 0.00031739 0.16085873 


A O H 

1.31 


5.12 


43 211004_s_at 


ALDH3B1 


1.99 0.00012246 0.12413388 


1.18 


C AO A *-*4 O 

5.0o llqlo 


44 206662_at 


GLRX 


3.21 0.00010258 0.12304972 


1.16 


i- AC C« H >l 

5.05 oql4 


45 222292_at TNFRSF5 7.66 0.00029289 0.16085873 1.26 5.05 20q12-q13.2 

46 202699_s_at KIAA0792 2.56 0.00019823 0.13783906 1.21 5.04 1q42.13 

47 232095_at 6.28 0.0001747 0.13388836 1.18 5.01 

48 228220_at LOC115548 4.21 0.00035438 0.17265105 1.27 5.01 5q13.1 

49 226646_at KLF2 2.63 0.00011379 0.12332854 1.14 4.98 19p13.13-p13.11 

50 220744_s_at WDR10 -2.84 0.00031279 0.16085873 -1.20 -4.98 3q21 



2.9 cpre versus prob 



# affy id HUGO name fc p q stn t Map Location 

1 225563_at LOC255967 -4.01 4.07E-10 1.22E-05 -1.81 -9.77 13q12.13 

2 204069_at MEIS1 -19.17 1.33E-08 0.00013226 -1.93 -9.40 2p14-p13 

3 242414_at -5.64 4.25E-08 0.00031756 -1.64 -8.37 

4 212063_at CD44 -2.87 1.18E-08 0.00013226 -1.51 -8.17 11p13 

5 208302_at HB-1 -3.77 3.01E-07 0.00151017 -1.33 -7.07 5q31.3 

6 204674_at LRMP -3.36 6.49E-07 0.00188309 -1.37 -7.00 12p12.1 

7 201153_s_at MBNL1 -1.65 3.03E-07 0.00151017 -1.31 -6.99 3q25 

8 35974_at LRMP -3.51 9.53E-07 0.00219068 -1.34 -6.83 12p12.1 
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9 239214_at 




a nft 


q pop (xt n nni R9 aai; 


-1 97 


-6 83 


10 219033_at 


FLJ21308 


O. ftn 
-o.bU 


a pftp ot n nm 7A7Q 




-6 81 5a11 1 


M a f% jt*. A rs. A A x 

11 204044_at 


QPRT 


ft OQ 

-D.OO 


•1 ftftc nft n nnonn9PQ 
i.ooc-uo u.uuouuzoy 


-1 ^7 


-fi 76 16n12 1 


12 218847_at 


IMP-2 


-b.04 


a qqp n7 n nnippono 
o.yot-Uf u.uu looouy 




-fi 73 3a28 


a at*. A r~ ^ A. 

13 215925_s_at 


CD72 


-5.4U 


o. lyt-ur u.uu looouy 


-1 97 


.ft 70 Qn13 1 


14 201105_at 


LGALS1 


-10.23 


1.91E-06 0.00300289 


«i OO 

-1 .29 


-0.54 Z^q IO.I 


15 242172_at 




-6.31 


1.82E-06 0.00300289 


-1.27 


-6.52 


16 209822_s_at 


VLDLR 


-4.50 


7.57E-07 0.00188632 


-1.21 


-6.52 9p24 


17 232530_at 




-15.62 


3.74E-06 0.00520412 


-1.36 


-6.46 


18 205821_at 


D12S2489E 


-4.97 


1.29E-06 0.00275807 


-1.19 


-6.37 12p1 3.2- 












p12.3 


19 232231 _at 




-6.1 D 


A C1C ftG ft ftftQftftOPO 

i .b i t-Ub u.uuouuzoy 


- 1 .zu 


"\j.O\J 


20 209982_s_at 


NRXN2 


C QA 

-5.81 


O QQC ftp ft ftftKOftAlO 

o.oot-UD u.uuo^uht^ 


1 9fi 


-ft **0 1 1a13 


21 203476_at 


TPBG 


O 7Q 

-8.73 


c qhc ftft ft ftftftftftCPA 

0.81 fc-Ub U.UUbb00o4 


i 9ft 


-ft 90 6fi14-n15 


22 228580_at 


HTRA3 


-5.^1 


A ftQC ftft ft nnKOi OftO 

4.uyt-ub u.uuoonzoz 


-1 99 


-R 90 4n16 1 


23 214651_s_at 


HOXA9 


-49.38 


Q ftQC nft ft ftft7/tftAO 

o.uot-Ub u.uu f ^-u^fy 


- I .OO 


-fi 1fi 7n15-n14 


24 219463_at 


C20orf103 


-8.71 


A OftC ftfi ft ftftOftftOPQ 

l.yUt-ub u.uuouuzoy 


- I . IU 


-R nn 9nni 9 


25 204304_s_at 


PROML1 


-8.91 


-7 C7r 00 O ftft7Q*l7C*7 


i iP 


QR An1R ^ 


26 222699_s_at 


FLJ13187 


2.10 


H <inc ftC ft ftftQ<1>lftftO 

i.iyt-uo u.uuyi4uu^ 


i m 

I . ID 


Q9 ftft99 1 


27 207030_s_at 


CSRP2 


-5.17 


O.uyt-Ub U.UUDDOOO*f 


1 i9 
- I . 


Rft 19n91 1 


28 232201_at 


NKD2 


-3.56 


a f\7c ric ft nnoC7>l Q7 


i i R 


_c 7Q c n i c q 
-o./s op IO.O 


29 213147_at 


HOXA10 


a a r\r\ 

-14.90 


A r> AT~ oc O f\A A C\A 7 A tZ 

1 .d4c-0O U.U1 1 U4 / 40 




g Tp 7n1R-n14 
-o.fO #pio-pi*t 


30 238750_at 




-3.27 


O OQtr ftC ft ftftKftC007 
O.08 t-Ub U.UUOUOZZf 


i rm 
- 1 .uo 


_c 77 


31 212526_at 


SPG20 


-5.68 


C -1 A C ftC ft ftftPR^CsPA 

b. i4fc:-Ub U.UUDDO0O*fr 


a 07 

- 1 .Uf 


-*> 7^ 1^n1^ 1 


32 237439_at 


FLJ30626 


o oo 

-3.03 


"7 HOC ftC ft ftft7Q*f 7*^"7 

/ . 1 0 t-Ub u.uu (o \ (o( 


A ftQ 

- 1 .uo 


71 17n1^ 1 

— O. f 1 1 I p 1 <J* 1 


33 22281 2_s_at 


ARHF 


2.02 


ft CftC ft£5 ft ftftTQ^NKPO 

y. by t-Ub u.uu/ ybDoz 


i n.7 

1 .U/ 


c ftp i9n94 ^1 

O.OO \c.\\£.*t*\j 1 


34 211126_s_at 


CSRP2 


O TO 

-3.72 


*7 cnc ftft ft ftft7017<?7 

7.&yfc-ub u.uu / o\ (Of 


i Oft 

- 1 .uo 


-O.OO I I. 1 


35 218535_s_at 


FLJ11159 


o oc 

-2.25 


c occ ftc ft nnccQQOf\ 
o.2bfc-Ub U.UUbooo^o 


i n** 
- 1 .uo 


.c ftq cjo1^ 
-o.oo 04 1 J 


36 228855_at 




-4.78 


A QCC ftC ft fti •! COPftQ 

i .oofc-uo u.un i ouooy 


i i9 


C CO 

-0.057 


37 240581_at 




-4.44 


6.23E-06 0.00665584 


i H9 
- 1 .UZ 


_C Cft 


38 232298_at 




-5.21 


8.29E-06 0.0074049 


1 09 
- I .UZ 


-R R9 
-0.0^. 


39 213150_at 


HOXA10 


-44.22 


3.19E-05 0.01671744 


i 9i 


„c aq 7n1R-n14 


40 223475_at 


LOC83690 


-6.81 


2.35E-05 0.01406642 


i no 
- 1 .uy 


A7 ftn1*3 ^ 


41 230441_at 


KIAA1909 


3.52 0.00016116 0.03651205 


i 9*=; 
I .zo 


c Aft Rn15 33 


42 241394_at 




-3.50 


8.73E-06 0.00745405 


i nn 
- I .uu 


-O.HO 


43 213894_at 


KIAA0960 


-4.48 


8.42E-06 0.0074049 


i nn 
- 1 .uu 


c a*> 7n91 3 


44 212856_at 


KIAA0767 


-2.76 


1.66E-05 0.01104745 


A ftO 

-1 .02 


c 00 oooi Q Qi 
-O.oy ZZq lo.o I 


45 235753_at -7.26 3.73E-05 0.01860589 -1.11 -5.36 

46 228083_at CACNA2D4 -4.72 1.13E-05 0.00888438 -0.97 -5.33 12p13.33 

47 201151_s_at MBNL1 -1.90 1.72E-05 0.0111503 -1.00 -5.32 3q25 

48 209905_at HOXA9 -142.96 5.08E-05 0.02175935 -1.20 -5.28 7p15-p14 

49 241985_at FLJ37870 3.59 7.40E-05 0.02572554 1.05 5.28 5q13.3 

50 233500_x_at LLT1 -3.00 1.44E-05 0.01051093 -0.97 -5.27 12p13 



2.10 cpreph versus kort 



# affy id HUGO name fc p q stn t Map Location 

1 211990_at HLA-DPA1 8.45 2.33E-19 5.27E-15 2.87 17.72 6p21.3 

2 209619_at CD74 12.48 1.48E-16 1.68E-12 2.84 16.79 5q32 

3 213539_at CD3D -32.38 2.65E-08 1.11E-05 -2.94 -11.71 11q23 

4 208690_s_at PDLIM1 10.92 3.28E-12 2.48E-08 2.01 11.70 10q22-q26.3 

5 215933_s_at HHEX 8.73 1.22E-11 6.91E-08 2.04 11.51 10q23.32 

6 241871_at -10.32 5.94E-09 4.18E-06 -2.21 -11.22 

7 210982_s_at HLA-DRA 20.71 4.54E-11 2.06E-07 2.03 11.13 6p21.3 

8 227584_at 13.39 2.08E-10 3.69E-07 2.05 10.62 

9 217478_s_at HLA-DMA 11.76 1.23E-10 3.49E-07 1.91 10.56 6p21.3 

10 208894_at HLA-DRA 17.95 1.81E-10 3.69E-07 1.93 10.48 6p21.3 

11 218029_at FLJ13725 6.85 7.49E-11 2.83E-07 1.77 10.24 16q21 

12 210349_at CAMK4 -4.82 5.62E-08 1.96E-05 -2.14 -10.18 5q21.3 

13 201137_s_at HLA-DPB1 10.97 2.81E-10 4.25E-07 1.73 9.83 6p21.3 

14 211991_s_at HLA-DPA1 23.54 6.66E-10 9.44E-07 1.80 9.80 6p21.3 

15 204670_x_at HLA-DRB5 7.70 1.22E-10 3.49E-07 1.65 9.74 6p21.3 

16 217979_at NET-6 8.46 2.11E-10 3.69E-07 1.64 9.62 7p21.1 

17 202789_at -3.43 1.96E-08 9.25E-06 -1.79 -9.59 

18 204689_at HHEX 5.78 1.39E-10 3.51E-07 1.56 9.39 10q23.32 

19 203708_at PDE4B 8.45 2.80E-10 4.25E-07 1.57 9.30 1p31 

20 229390_at 7.33 1.55E-10 3.51E-07 1.51 9.16 

21 208306_x_at HLA-DRB4 9.66 1.25E-09 1.46E-06 1.53 8.92 6p21.3 

22 209312_x_at HLA-DRB1 8.00 1.09E-09 1.37E-06 1.50 8.85 6p21.3 

23 227077_at -4.13 1.20E-07 3.10E-05 -1.70 -8.81 

24 224925_at PRex1 8.89 4.29E-09 3.47E-06 1.59 8.81 20q13.13 

25 221969_at PAX5 7.71 3.90E-09 3.28E-06 1.57 8.77 9p13 

26 212998_x_at HLA-DQB1 22.57 5.40E-09 3.95E-06 1.60 8.76 6p21.3 

27 201015_s_at JUP 27.52 6.09E-09 4.18E-06 1.58 8.67 17q21 

28 202207_at ARL7 -6.81 2.80E-07 5.46E-05 -1.71 -8.59 2q37.2 

29 201721_s_at LAPTM5 2.32 5.06E-09 3.95E-06 1.44 8.51 1p34 

30 224774_s_at NAV1 13.55 1.39E-08 7.33E-06 1.61 8.46 

31 209771_x_at CD24 5.38 1.29E-09 1.46E-06 1.40 8.44 6q21 

32 224772_at NAV1 9.61 3.77E-09 3.28E-06 1.44 8.42 

33 215193_x_at HLA-DRB1 11.61 9.51E-09 5.53E-06 1.49 8.36 6p21.3 

34 229029_at -11.30 2.89E-07 5.56E-05 -1.61 -8.32 

35 222895_s_at BCL11B -16.00 1.31E-06 0.00016349 -1.96 -8.29 14q32.31 

36 216379_x_at KIAA1919 5.86 2.68E-09 2.64E-06 1.39 8.28 6q22 

37 221581_s_at WBSCR5 9.93 1.70E-08 8.39E-06 1.52 8.26 7q11.23 

38 224909_s_at PRex1 4.66 3.13E-09 2.84E-06 1.37 8.21 20q13.13 

39 224710_at RAB34 5.39 9.57E-10 1.28E-06 1.33 8.21 17q11.1 

40 213082_s_at SQV7L 6.07 3.06E-09 2.84E-06 1.36 8.19 9q22.31 

41 209760_at KIAA0922 -2.96 2.14E-07 4.67E-05 -1.52 -8.15 4q31.3 

42 209732_at CLECSF2 3.03 1.60E-09 1.72E-06 1.31 8.07 12p13-p12 

43 225129_at CPNE2 6.31 1.96E-08 9.25E-06 1.45 8.07 16q12.2 

44 203932_at HLA-DMB 6.70 5.23E-09 3.95E-06 1.35 8.06 6p21.3 

45 213817_at 8.87 7.46E-09 4.75E-06 1.34 7.96 

46 201161_s_at CSDA 3.47 2.43E-09 2.51E-06 1.29 7.93 12p13.1 

47 223380_s_at LATS2 5.21 1.22E-08 6.75E-06 1.35 7.92 13q11-q12 

48 226459_at FLJ35564 4.53 9.18E-09 5.48E-06 1.33 7.88 10q23.33 

49 226878_at 6.64 2.44E-08 1.09E-05 1.38 7.86 

50 37384_at PPM1F 3.92 3.58E-08 1.40E-05 1.42 7.85 22q11.22 

PCT/EP2004/012459 
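Once the cells of a marker row are on one line in the order #, affy id, HUGO name, fc, p, q, stn, t, Map Location, the row can be read programmatically. The following is a minimal sketch under stated assumptions: the names `MarkerRow` and `parse_marker_row` are hypothetical, cells are assumed to be separated by whitespace, and the HUGO name and the map location are treated as optional because some rows lack them. The two example rows are taken from the table of section 2.10.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarkerRow:
    rank: int
    affy_id: str
    hugo_name: Optional[str]
    fc: float
    p: float
    q: float
    stn: float
    t: float
    map_location: Optional[str]


def _is_float(token: str) -> bool:
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_marker_row(line: str) -> MarkerRow:
    """Parse one whitespace-separated table row (hypothetical helper)."""
    tokens = line.split()
    if len(tokens) < 7:
        raise ValueError(f"too few cells: {line!r}")
    rank, affy_id, rest = int(tokens[0]), tokens[1], tokens[2:]
    # The HUGO name is optional: a non-numeric token after the ID is taken as the name.
    hugo_name = None
    if not _is_float(rest[0]):
        hugo_name, rest = rest[0], rest[1:]
    if len(rest) < 5:
        raise ValueError(f"expected fc, p, q, stn and t in: {line!r}")
    fc, p, q, stn, t = (float(cell) for cell in rest[:5])
    # Whatever remains is the map location; it is absent for unmapped probe sets.
    map_location = " ".join(rest[5:]) or None
    return MarkerRow(rank, affy_id, hugo_name, fc, p, q, stn, t, map_location)


named = parse_marker_row("2 209619_at CD74 12.48 1.48E-16 1.68E-12 2.84 16.79 5q32")
unnamed = parse_marker_row("6 241871_at -10.32 5.94E-09 4.18E-06 -2.21 -11.22")
```

A row whose affy id cell itself contains spaces would need special handling that this sketch does not attempt.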



2.11 cpreph versus pret 



# affy id HUGO name fc p q stn t Map Location 

1 211990_at HLA-DPA1 7.99 1.27E-09 2.26E-06 2.52 13.01 6p21.3 

2 210982_s_at HLA-DRA 26.21 3.82E-11 6.71E-07 2.12 11.26 6p21.3 

3 208894_at HLA-DRA 25.74 1.49E-10 6.71E-07 2.07 10.73 6p21.3 

4 204670_x_at HLA-DRB5 10.16 8.84E-11 6.71E-07 1.90 10.39 6p21.3 

5 217478_s_at HLA-DMA 10.24 1.21E-10 6.71E-07 1.85 10.17 6p21.3 

6 209771_x_at CD24 14.60 9.03E-11 6.71E-07 1.82 10.10 6q21 

7 201137_s_at HLA-DPB1 14.15 1.76E-10 6.71E-07 1.81 9.96 6p21.3 

8 211991_s_at HLA-DPA1 26.38 6.35E-10 1.62E-06 1.87 9.87 6p21.3 

9 216379_x_at KIAA1919 14.23 3.21E-10 1.05E-06 1.75 9.66 6q22 

10 227584_at 7.36 5.02E-10 1.44E-06 1.77 9.65 

11 209312_x_at HLA-DRB1 10.95 8.13E-10 1.86E-06 1.79 9.59 6p21.3 

12 208306_x_at HLA-DRB4 12.34 1.20E-09 2.26E-06 1.79 9.48 6p21.3 

13 221000_s_at FKSG28 7.72 1.28E-09 

14 221969_at PAX5 9.85 

15 212998_x_at HLA-DQB1 32.62 

16 215193_x_at HLA-DRB1 19.74 

17 201161_s_at CSDA 6.72 

18 203932_at HLA-DMB 7.47 

19 229487_at 11.49 

20 224772_at NAV1 8.13 

21 202113_s_at SNX2 4.79 

22 209619_at CD74 5.32 

23 224774_s_at NAV1 11.56 

24 266_s_at CD24 19.99 

25 211336_x_at LILRB1 9.21 

26 208650_s_at CD24 23.01 

27 208651_x_at CD24 8.91 

28 227998_at MGC17528 13.10 

29 213537_at HLA-DPA1 26.48 

30 219686_at HSA250839 41.30 

31 226878_at 5.66 

32 203543_s_at BTEB1 23.84 

33 203603_s_at ZFHX1B 4.20 

34 223046_at EGLN1 6.14 

35 200696_s_at GSN 6.09 

36 202114_at SNX2 4.22 

37 209238_at STX3A 3.63 

38 207697_x_at LILRB2 4.60 

39 219271_at GalNac-T10 8.35 
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